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ABSTRACT 



In this thesis we study synchronization phenomena in natural and artificial coupled 
multi-component systems, applicable to the scalability of parallel discrete-event sim- 
ulation for systems with asynchronous dynamics. We also study the role of various 
complex communication topologies as synchronization networks. We analyze the 
properties of the virtual time horizon or synchronization landscape (corresponding 
to the progress of the processing elements) of these networks by using the framework 
of non-equilibrium surface growth. 

When the communication topology mimics that of the short-range interact- 
ing underlying system, the virtual time horizon exhibits Kardar-Parisi-Zhang-like 
kinetic roughening. Although the virtual times, on average, progress at a nonzero 
rate, their statistical spread diverges with the number of processing elements, hin- 
dering efficient data collection. We show that when the synchronization topology is 
extended to include quenched random communication links (small-world links) be- 
tween the processing elements, they make a close-to-uniform progress with a nonzero 
rate, without global synchronization. This leads to a fully scalable parallel simu- 
lation for underlying systems with asynchronous dynamics and short-range inter- 
actions. We study both short-range and small-world synchronization topologies in 
one- and two-dimensional systems. We also provide a coarse-grained description 
for the small-world-synchronized virtual-time horizon and compare the findings to 
those obtained by "simulating the simulations" based on the exact algorithmic rules. 
We also present numerical results for the evolution of the virtual-time horizon on 
scale-free Barabasi- Albert networks serving as communication topology among the 
processing elements. 

Finally, we investigate to what extent small-world couplings (extending the 
original local relaxational dynamics through the random links) lead to the suppres- 
sion of extreme fluctuations in the synchronization landscape. In the absence of 
the random links, the steady-state landscape is "rough" (strongly de- synchronized 
state) and the average and the extreme height fluctuations diverge in the same 
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power-law fashion with the system size (number of nodes). With small- world links 
present, the average size of the fluctuations becomes finite (synchronized state). 
For exponential-like noise the extreme heights diverge only logarithmically with the 
number of nodes, while for power-law noise they diverge in a power-law fashion. 
The statistics of the extreme heights are governed by the Fisher-Tippett-Gumbel 
and the Frechet distribution for exponential and power-law noise, respectively. 
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CHAPTER 1 
INTRODUCTION 



1.1 Complex Networks 

Cooperative behavior and collective phenomena have always been the center 
stage of statistical physics. More recently, the study of complex systems has become 
widespread across disciplines ranging from socio-economic systems, traffic models, 
epidemic models, to the Internet, the World-Wide Web, and grid computer networks. 
With the tools and frameworks provided by modern statistical physics, and with the 
availability of rapidly increasing computational resources, there is a chance to gain 
deeper understanding of the behavior of these systems. 

One direction to study complexity is using minimal models where one consid- 
ers a large number of simple interacting entities (agents, individuals, components, 
etc.) assuming a (typically simple) effective interaction between these entities. For 
example, in the Ising model for ferromagnets, the entities are the two-state spins 
and the interaction energetically prefers neighboring spins to be aligned. In simple 
models for social systems, the entities are humans, and the interaction can be, e.g., 
mimicking (simple majority influence by their social contacts). 

While the interactions and the individual components may be simple, the col- 
lective behavior of these interacting systems are often far from trivial. For example, 
in the Ising model, in sufficiently high dimension, spontaneous order (symmetry 
breaking) emerges below some critical temperature. At the critical point the sys- 
tems becomes strongly correlated, even though the interaction between spins only 
extends to a few neighbors. These are the kind of emergent behaviors we are in- 
terested in, namely, how locally interacting entities can produce large-scale effects. 
It needs to be emphasized that in these models, complexity emerges through the 
"outcome" of the evolution of the system with a large number of entities, not in the 
construction of the individual-level ( "microscopic" ) dynamics or rules. 

Despite the great complexity and variety of systems, universal laws and phe- 
nomena are essential to our inquiry and to our understanding pQ. One way of de- 
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scribing complex systems is modeling them mathematically by using the framework 
of networks which is essentially a relational approach. A network can be defined 
as a set of items, referred to as nodes, and links connecting them. It is a concept 
borrowed from the graph theory, a subfield of combinatorics in mathematics. 

The study of complex networks pervades various areas of science ranging from 
sociology to statistical physics 013 El- Many of our important technological, infor- 
mation, and infrastructure systems can be considered complex networks IE] 
with a large number of components. The links between the nodes in these net- 
works facilitate some kind of effective interaction/dynamics between the nodes. 
Examples (with the processes inducing the interaction between the nodes) include 
high-performance scalable parallel or grid-computing networks (synchronization pro- 
tocols for massive parallelization) [Hj, diffusive load-balancing schemes (relocating 
jobs among processors) jHj, the Internet (protocols for sending/receiving packets) 
ED] , the World Wide Web (hyperlinks in the web pages for other web pages) 
[TT] , the electric power grid (generating/ transmitting power between generators and 
buses) [Zj, metabolic networks (reactions between molecules) ^2] or social networks 
(acquaintance or social contacts) fHJE]- Many of these systems are autonomous 
(by design or historical evolution), i.e., they lack a central regulator. Thus, fluctua- 
tions in the "load" in the respective network (data/state savings or task allocation 
in parallel simulations, traffic in the Internet, voltage/phase in the electric grid etc.) 
are determined by the collective result of the individual decisions of many interacting 
"agents" (nodes). As the number of processors on parallel architectures increases to 
hundreds of thousands grid-computing networks proliferate over the Internet 
jHUIIZj. or the electric power-grid covers, e.g., the North- American continent [Zj, 
fundamental questions on the corresponding dynamical processes on the respective 
underlying networks must be addressed. 

Regular lattices are commonly used to study physical systems with short-range 
interactions. Earlier studies focused mostly on the topological properties of the net- 
works. Recent works, motivated by a large number of natural and artificial systems, 
such as the ones listed above, have turned the focus to processes on networks, where 
the interaction and dynamics between the nodes are facilitated by a complex net- 
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work. The question then is how this possibly complex interaction topology influences 
the collective behavior of the system. 

A common property of many real-life networks is that the degree or connectiv- 
ity (number of connections of a node) follow a scale-free (power-law) distribution. 
Examples include world wide web, router level Internet, movie actors collaboration 
network, science collaboration network, cellular networks and linguistic networks 
[2|. Barabasi and Albert |5] introduced a growth model with preferential attach- 
ment producing scale-free networks. They added one node at every time step with 
m links and connected this node to existing nodes with a probability proportional 
to the degree of the existing nodes. This method leads to a power-law degree distri- 
bution function (having a heavier tail compared to an exponential one), P(/c) = ^pn, 
shown in Fig. II. If a). The consequence of the power- law tail in the degree distribu- 
tion is the existence of hubs, i.e., a few nodes with a large number of connections, 
often observed in real-life networks. 



(a) (b) 




k 



Figure 1.1: (a) Degree distribution of the scale-free Barabasi- Albert net- 
work with m=5, yielding an average degree of (A;)=10. The 
dashed line shows the power-law behavior in log-log scale, 
(b) Degree distribution of a SW network (Erdos-Renyi Net- 
work on a ID ring) with p=8. This p value with the additional 
two nearest-neighbor links yields an average degree (A;) =10. 
The dashed curve is a Poissonian. The delta-function with 
circles around (k)=2 is the degree distribution of the regular 
one-dimensional short-range network with only two nearest- 
neighbor links. 



Watts and Strogatz, inspired by a sociological experiment [TH] , have proposed 
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a network model known as the small-world (SW) network 19 . The SW concept 
describes the observation that, despite their often large size, there is a relatively short 
path between any two nodes in most networks with some degree of randomness. The 
SW model was originally constructed as a network to interpolate between regular 
lattices and completely random networks (2D]- Watts and Strogatz considered a 
regular short-range network with k nearest links per node. Then they randomly 
visited the links and rewired them to randomly chosen nodes with probability p. 
Thus, by varying the parameter p they were able to interpolate between a regular 
(p=0) and a completely random (p=l) network. 

Another way of constructing the SW network, instead of rewiring, is visiting 
every pair of nodes and adding a link between them with probability p/N, where N 
is the number of nodes. This construction on top of the regular network, also called 
random graph and first introduced by Erdos and Renyi j20j > have been traditionally 
used to describe the networks of random topology. The degree distribution of this 
SW graph is a Poissonian centered at the mean degree, (&)~p + 2, as shown in 
Fig- 11.1( b) with p=8. For p=0, we obtain the short-range regular network with a 
Kronecker-delta degree distribution, P(k)=5k z where z is the coordination number 
(in ID, z=2, see Fig.lOb)). 

Another important characteristic of networks is the average shortest path 
length 5 avg . The shortest path length can be defined as the minimum number of in- 
termediary nodes between two nodes. All networks with some degree of randomness 
has the property that S avg is much smaller than that of regular network with the 
same number of nodes and with the same average degree. This very short separation 
between any pair of nodes is commonly referred to as the "low degree of separation" . 
Typically the average shortest path length increases no faster than the logarithm of 
the number of nodes N. For illustration we show the average shortest path length 
as a function of system size N for SW (p=8) and BA (m=5) network in Fig. 11.21 
Note that on a d- dimensional regular network 5 avg ~ N^ es ~ Nu near where N no d es 
is the number of nodes, Nu near is the linear system size (N nodes =Nf inear ) and d is 
the dimension. 

Systems and models (with well-known behaviors on regular lattices) have been 
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Figure 1.2: The average shortest path length 5 avg as a function of system 
size iV for SW (p=8) and BA (m=5). Note normal-log scale. 

studied on SW networks, such as the Ising model (2H 1221 EH 121] , the XY model 
fIE\ . phase ordering [2B|, the Edwards- Wilkinson model [231211123 and diffusion 
|2Ill2Hll2niEniEIlE2lE3Ell- Closely related to phase transitions and collec- 
tive phenomena is synchronization in coupled mult i- component systems jHH]- SW 
networks have been shown to facilitate autonomous synchronization which is an im- 
portant feature of these networks from both fundamental and system-design points 
of view [Uni EUl EE] • m this thesis we study a synchronization problem which emerges 
ISH] in certain parallel/distributed algorithms referred to as parallel discrete-event 
simulation (PDES) |Ul EU H21 HSj ■ First, we find that constructing a SW-like syn- 
chronization network for PDES can have a huge impact on the scalability of the 
algorithm Secondly, since the particular problem is effectively "local" relax- 
ation in a noisy environment in a SW network, our study also contributes to the 
understanding of collective phenomena on these networks. 



1.2 Parallel Discrete-Event Simulation 

Simulation of large spatially extended complex systems in physics, engineer- 
ing, computer science, or military applications require vast amount of CPU-time 
on serial machines using sequential algorithms. PDES enabled researchers to imple- 
ment faithful simulations on parallel/distributed computer systems, namely, systems 
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composed of multiple interconnected computers jJOJ El GU HH] • Developing and im- 
plementing massively parallel algorithms is among the most challenging areas in 
computer/computational science and engineering [23]. While there are numerous 
technological and hardware- related points, e.g., concerning efficient message pass- 
ing and fast communications between computer nodes, the theoretical algorithmic 
challenge is often as important. 

PDES is a subclass of parallel and distributed simulations in which changes 
in the components of the system occur instantaneously from one state to another. 
In physics, chemistry and biology communities these types of simulations are most 
commonly referred to as dynamic or kinetic Monte Carlo simulations [13] ■ Exam- 
ples of such simulation systems include cellular communication networks 021 EH], 
magnetic systems [HI EH], spatial epidemic models thin-film growth [SHI EI], 
battle-field models [H2], arid internet traffic models [SH]- In these simulations the 
discrete events are call arrivals, spin-flip attempts, infections, monomer depositions, 
troop movements, and packet transmissions/receptions respectively. In these simu- 
lations the algorithm must faithfully and reproducibly keep track of the asynchrony 
of the local updates in the system's configuration. For example standard random- 
sequential Monte Carlo simulations naturally produce Poisson asynchrony. In fact, 
such continuous-time simulations (e.g., single spin-flip Glauber dynamics [311 ) were 
long believed to be inherently serial until Lubachevsky's illuminating work [5J3 EE! 
on the parallelization of these simulations without altering the underlying dynam- 
ics. The essence of the problem is to algorithmically parallelize "physically" non- 
parallel dynamics of the underlying system while enforcing causality between events 
and reproducibility. This requires some kind of synchronization to ensure causality 
between events processed by different processing elements (PEs). 

The two basic ingredients of PDES are the set of local simulated times (or 
virtual times [HZ]) and a synchronization scheme [10] • The difficulty in PDES is 
that the discrete events are not synchronized by a global clock, since the dynamic 
is usually asynchronous. There are two main approaches in PDES: (i) conservative 
synchronization, which avoids the possibility of any type of causality errors by check- 
ing if each event is safe to process [HB1EE] and (ii) optimistic synchronization, which 
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allows causality errors, then initiates rollbacks to correct the erroneous computations 
[H7| EHj • Innovative methods have also been introduced to make optimistic synchro- 
nization more efficient, such as reverse computation jBT]. Other recent improvements 
to exploit parallelism in discrete event systems are the "lookback" method [HZ. and 
the freeze- and- shift algorithm 

A PDES should have the following properties to be scalable First, a 

scalable PDES scheme must ensure that the average progress rate of the simulation 
approaches a nonzero constant in the long-time limit as the number of PEs, Npe, 
goes to infinity (computational scalability) 1 . Second, the "width" of the simulated 
time horizon (the spread of the progress of the individual PEs) should be bounded 
as Npe goes to infinity (measurement scalability) [HI]. The second requirement is 
crucial for the measurement phase of the simulation to be scalable to avoid long 
delays while waiting for "slow" nodes [SDj or, alternatively, to eliminate the need to 
reserve a large amount of memory for temporary data storage: a large width of the 
virtual time horizon hinders scalable data management. Temporarily storing a large 
amount of data on each PE (being accumulated for "on-the-fly" measurements) is 
limited by available memory while frequent global synchronizations can get costly 
for large Npe- Thus, one aims to devise a scheme where the PEs make a nonzero 
and close-to-uniform progress without global synchronization. In such a scheme, the 
PEs autonomously learn the global state of the system (without receiving explicit 
global messages) and adjust their progress rate accordingly. In this thesis we study 
regular and SW network communication topologies and show a possible way to 
construct fully scalable parallel algorithms for underlying systems with asynchronous 
dynamics and short-range interactions on regular lattices. 

Since one is interested in the dynamics of the underlying complex system, the 

PDES scheme must simulate the "physical time" variable of the complex system. 

When the simulations are performed on a single processor machine, a single (global) 

time stream is sufficient to "label" or time-stamp the updates of the local configu- 

1 The current largest supercomputer is the IBM/DOE Blue Gene/L with 32K nodes ^3- As 
a matter of fact the largest natural supercomputer is the brain, which does an immense parallel 
computing task to sustain the individual. In particular the human brain has 10 11 PEs (neurons) 
each with an average of 10 4 synaptic connections, creating a bundle on the order of 10 15 "wires" 
jammed into a volume of approximately 1400 cm 3 . 
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rations, regardless whether the dynamics of the underlying system is synchronous 
or asynchronous. When simulating asynchronous dynamics on distributed architec- 
tures, however, each PE generates its own physical, or virtual time, which is the 
physical time variable of the particular computational domain handled by that PE. 
As a result of the local stochastic time increments and the synchronization dynam- 
ics, at a given wall-clock instant the simulated virtual times of the PEs can differ, a 
phenomenon called "time horizon roughening" . We denote the simulated, or virtual 
time at PE i measured at wall-clock time t, by Tj(t). The wall-clock time t is directly 
proportional to the (discrete) number of parallel steps simultaneously performed on 
each PE, also called the number of Monte-Carlo steps (MCS) in dynamic Monte 
Carlo simulations. Without altering the meaning, t from now on will be used to 
denote the number of discrete steps performed in the parallel simulation. The set of 
virtual times {rj(t)}^ E forms the virtual time horizon (synchronization landscape) 
of the PDES scheme after t parallel updates. 

The design of efficient parallel-update schemes is a rather challenging problem, 
due to the fact that the dynamics of the simulation scheme itself is a complex system 
where the specific synchronization rules correspond to the "microscopic dynamics" , 
and its properties are hard to deduce using classical methods of algorithm analysis. 
Here we present a less conventional approach to the analysis of efficiency and scala- 
bility for the class of massively parallel conservative PDES schemes, by mapping the 
parallel computational process itself onto a non-equilibrium surface growth model 
[3§] . Then, using methods from statistical mechanics to study the dynamics of such 
surfaces (in a completely different context), we solve the scalability problem of the 
computational PDES scheme jHHSHl- Similar connections between phase transitions 
and computational complexity have recently been made [013 [00] for rollback-based 
(or optimistic) PDES algorithms [HZj and self-organized criticality [OTJIOE]- These 
connections have turned out to be highly fruitful to gain more insight into tradition- 
ally hard computational problems [0111101 • In this thesis we consider the scalability 
of conservative synchronization schemes for self-initiating processes jTU [72] , where 
update attempts on each node are modeled as independent Poisson streams and are 
independent of the configuration of the underlying system j^SlEOl- We study the 
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morphological properties of the virtual time horizon. Although these properties sim- 
plify the analysis of the corresponding PDES schemes, they can be highly efficient 
[17] and are readily applicable to a large number problems in science and engineer- 
ing. Further, the performance and the scalability of these PDES schemes become 
independent of the specific underlying system i.e., we learn the generic behavior 
of these complex computational schemes. Through our study one also gains some 
insight into the effects of SW-like interaction topologies on the critical fluctuations 
in interacting systems. 

This thesis is organized as follows. In Chapter 2 we show detailed results for 
the short-range model on one and two-dimensional regular networks with nearest- 
neighbor communication, which we refer to as the basic conservative synchronization 
(BCS) scheme p>9j. In Chapter 3 we extend our study to SW networks, constructed 
by adding random links to regular networks jHj. Chapter 4 presents the results on 
scaling and distributions of the extreme fluctuations in regular and SW networks. 
In Chapter 5 we summarize our work and discuss future directions. 



CHAPTER 2 

SYNCHRONIZATION IN REGULAR NETWORKS 

First, we briefly summarize the basic observables relevant to our analysis of syn- 
chronization and the scaling relations borrowed from non-equilibrium surface growth 
theory The set of local simulated times for the PEs, {-^(t)}^ 5 , constitutes the sim- 
ulated time horizon. Here Npe is the number of PEs and t is the discrete number of 
parallel steps, directly related to real (wall-clock) time. On a regular <i-dimensional 
hypercubic lattice NpE=N d , where N is the linear size of the lattice and d is the 
dimension. For a one-dimensional system Npe=N. In the rest of the thesis we will 
use the term "height", "simulated time", or "virtual time" interchangeably, since 
we refer to the same local observable (local field variable). 

Since the discrete events in PDES are not synchronized by a global clock, the 
processing elements have to communicate with others for synchronization. One of 
the first approaches to this problem for self-initiating processes is the basic conserva- 
tive synchronization (BCS) scheme proposed by Lubachevsky [551 by using only 
nearest neighbor interactions mimicking [SH] the interaction topology of the underly- 
ing physical system. His basic model associated each component or site with one PE 
(worst-case scenario) under periodic boundary conditions. In this BCS scheme, at 
each time step only those PEs whose local simulated time is not larger than the local 
simulated times of their next nearest neighbors are incremented by an exponentially 
distributed random amount so that the discrete events exhibit Poisson asynchrony. 
Namely, a PE will only perform its next update if it can obtain the correct informa- 
tion to evolve the local configuration (local state) of the underlying physical system 
it simulates, without violating causality. Hence, the evolution equation for site % 
simply becomes 

n{t + l) = n {t) + 7fc(*)e(-&(*))e(&+i(*)) , (2.1) 

where r)i(t) is an exponentially distributed random number, ©(...) is the Heaviside 
step-function and 4>i(t) = Ti(t) — Tj_i(£) is the local slope. In one-dimension with pe- 
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riodic boundary conditions, the network has a ring topology as shown in Fig. 12. If a), 
so each node is connected to the nearest left and right neighbors. The nearest- 




Figure 2.1: One-dimensional (ID) regular network (with periodic bound- 
ary conditions), where nodes are connected to their nearest 
neighbors. 

neighbor interaction in the BCS scheme implies that in order to ensure causality, 
PEs need to exchange information on their local simulated (virtual) times only with 
neighboring PEs in the virtual network topology. The possible configurations for 
the local simulated times for the successive nodes are shown in Fig. 12.21 In these 
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v ' i+1 
i-1 


(b) - 
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(c) 

i+1 

i-1 


(d) • 

i-1 
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Figure 2.2: Possible simulated-time configurations for three successive 
nodes (involving two successive slopes) in the basic conserva- 
tive scheme (BCS) in one dimension. From the perspective 
of node i, only configuration (b) allows it to proceed (node 
% is a local minimum). In all other cases causality could be 
violated if an update occurs at site i, because the local field 
variables of the neighboring nodes are not known at the in- 
stant of update attempt. 

configurations update occurs only if the node we are considering (node i) is a local 

minimum. In the other three cases the node i idles. In analyzing the performance of 

the above scheme, it is helpful that the progress of the simulation itself is decoupled 

from the possibly complex behavior of the underlying system. This is contrary to op- 
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timistic approaches, where the evolution of the underlying system and the progress 
of the PDES simulation are strongly entangled [HH] . making scalability analysis a 
much more difficult task. 

One of the important aspects of conservative PDES is the theoretical efficiency 
or utilization which can be defined simply as the average fraction of non-idling PEs. 
It also determines the average progress rate of the simulation. In the BCS where 
only nearest-neighbor interactions are present, the utilization is equal to the density 
of local minima in the simulated time horizon. On a regular one-dimensional lattice, 
it can be defined as 

i N i N 

(u(N, t)> = - £ (6(75..! - TiMnn - n)) = - £ (©(-^)©(^+i)> , (2.2) 
iV i=i iV i=i 

where 4 ) i =T % ~ T i~i is the local slope, ©(...) is the Heaviside step function, and 
(...) denotes an ensemble average. Note that the individual terms in the sum in 
Eq. ()2.2|) . (©(— 0j)O(0j+i)), become independent of i for a system of identical PEs 
due to translational invariance. 

Another important observable of PDES is the statistical spread or width of 
the simulated time surface. The measurement scalability of the PDES scheme, is 
characterized by the width. Instead of dealing with the actual spread (difference 
between the maximum and minimum values) we shall consider the average "width" , 
w. It is defined as the root-mean-square fluctuation of the virtual times measured 
from the mean, w = \J (w 2 ), where 

/ 1 Nd \ 
(w 2 ) = (w\N,t)) = ^—Y 1 lT l (t)-m} 2 J , (2.3) 

with f{t) = {\/N d ) YsfJi T iif) being the mean progress ("mean height") of the time 
surface, and d is the dimension. 

As we mentioned in Chapter 1, for the PDES scheme to be fully scalable, the 
following two criteria must be met: (i) the virtual time horizon must progress on 
average at a nonzero rate, and (ii) the typical spread of the time horizon should be 
finite, as the number of PEs N goes to infinity. When the first criterion is ensured for 
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large enough times t, the simulation is said to be computationally scalable, meaning 
that when increasing the size of the network to infinity, while keeping the average 
computational domain/load on a single PE the same, the simulation will progress 
at a nonzero rate. However, as we will show below, while increasing the system 
size, the spread in the time horizon can diverge, severely hindering frequent data 
collection about the state of the simulated system. Specifically, when one requires 
to take a measurement of some physical property of the simulated system at virtual 
time r, PEs have to wait (in wall-clock time) until all the virtual simulated times 
at all the PEs pass through the value of r. Thus, in order to collect system-wide 
measurements from the simulation, we incur a waiting time proportional to the 
spread, or width of the fluctuating time horizon. For PDES schemes for which the 
spread diverges with system size, the waiting time for the measurements will also 
diverge, and the scheme is not measurement scalable. When condition (ii) is fulfilled 
for large enough times t, we say that the PDES scheme is measurement scalable. 



2.1 Scaling in non-equilibrium surfaces 

Since we use the formalism and terminology of non-equilibrium surface growth 
phenomena, we briefly review scaling concepts for self-affine or rough surfaces. The 
scaling behavior of the width, (w 2 (N,t)) where iV is the linear system size and 
t is the time, alone typically captures and identifies the universality class of the 
non-equilibrium growth process EH HE] • In a finite system, the width initially 
grows as (w 2 (N,t)} ~ t 2/3 . After a system -size dependent cross-over time t x ~ N z , 
it reaches a steady-state (w 2 (N,t)) ~ N 2a for t 3> t x . In expressions above a, 
(3 and z=a/f3 are called the roughness, the growth, and the dynamic exponents, 
respectively. The above behavior can be summarized as follows 

. t 213 fort<t x , . 

w*{N,t))~{ , (2.4) 

N 2a for t»t x 

where t x ~N z is the cross-over time. From this scaling, one can also extract a length- 
scale, known as lateral correlation length, £ ~ t x l z for times less than t x , reaching 
the system size at the cross-over time. The temporal and system-size scaling of the 
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width exhibited by Eq. (J2.4j) can be captured by the Family- Vicsek j7B] relation, 

(w 2 (N,t)) = N 2a f(t/N z ). (2.5) 

Note that the scaling function f(x) depends on t and the linear system-size N only 
through the specific combination, t/N z , reflecting the importance of the crossover 
time t x . For small values of its argument f(x) behaves as a power law, while for 
large arguments it approaches a constant 

x w if x<l 

(2.6) 

const, if x^>l 

yielding the correct scaling behavior of the width for early times and for the steady- 
state, respectively. 

A somewhat less frequently studied quantity is the growth rate of a growing 
surface. This quantity is typically non- universal [SHI E3 HH1 El EH EH IB2J , but as 
was shown by Krug and Meakin |78j . on d- dimensional regular lattices, the finite- 
size corrections to it are. In the context of the basic PDES scheme, the growth rate 
of the simulated time surface corresponds to the progress rate (or utilization) of the 
simulation, hence our special interest in this observable. For the finite-size behavior 
of the steady-state growth rate, one has [TS] 

(u{N)) ~ („(oo)> + , (2.7) 

where (u(oo)) is the value of the growth rate in the asymptotic infinite system-size 
limit and a is the dimension-dependent roughness exponent of the growth process. 

2.2 One-Dimensional Basic Conservative Synchronization 
Network 

Based on a mapping between virtual times and surface site heights [HH] and on 
the analogy with the single-step surface growth model [HHl , in the coarse-grained de- 
scription [77], the virtual time horizon of the BCS is proposed to be governed by the 
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Kardar-Parisi-Zhang (KPZ) equation jHl], well-known in surface growth phenomena 



dtfi = V 2 n - A(Vf- 



+ r]i(t) 



(2. 



where V 2 fj is the discretized Laplacian, V 2 fj = fj + i+fj_i— 2fj, Vfj is the discretized 
gradient, Vfj = fj + i — fj, fj(t) = n — f is the surface height fluctuation (or virtual 
time) measured from the mean, r]i(t) is Gaussian noise delta-correlated in space and 
time, (Vi{t)i]j{t')) = 2D5ij5(t — t'), A is a positive constant, and ... stands for higher 
order irrelevant terms. Equation ()2.8j) can also give an account of a number of 
other nonlinear phenomena such as Burgers turbulence jHSj and directed polymers 
in random media [73]. When the simple update rule of the basic synchronization 
scheme is implemented on a one-dimensional network, one can observe a simulated 
time surface governed by the KPZ equation, and in the steady-state, by an Edwards- 
Wilkinson Hamiltonian [SB] [Fig. I2.3f a)]. 



(a) 



(b) 





ID Short-range (KPZ/EW) SW (p=0. 1) 

Figure 2.3: Virtual time horizon snapshots for 10,000 sites in ID. (a) 
For the regular network with nearest-neighbor connections 
(p=:0). The lateral correlation length £ and width w = y/vP 
are shown for illustration in the graph. The rough steady- 
state surface belongs to the KPZ/EW universality class, (b) 
For the SW synchronization network. The heights are effec- 
tively decorrelated and both the correlation length and the 
width are reduced, and approach system-size independent 
values for sufficiently large systems. The resulting surface is 
macroscopically smooth. Note that the heights are relative 
to the average height. 

When analyzing the statistical and morphological properties of the stochastic 
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landscape of the simulated times, it is convenient to study the height-height corre- 
lation or its Fourier transform, the height-height structure factor. The equal-time 
height-height structure factor S(k,t) in one-dimension is defined through 

N5 k) - k 'S(k,t) = (f h (t)f k/ (t)) , (2.9) 

where f k = SjLi e~ lk ^Tj is the Fourier transform of the virtual times with the 
wave number k=2irn/N , n—1,2, ...,N — 1 and 5 k - k > is the Kronecker delta. The 
structure factor essentially contains all the "physics" needed to describe the scal- 
ing behavior of the time surface. Here we focus on the steady-state properties 
(t— >oc) of the time horizon where the structure factor becomes independent of time, 
lim^oo S(k, t)—S(k). In the long-time limit, in one dimension, for a KPZ surface 
described by Eq. (12. 8|) one has (see Appendix A) [ZZj 

S(k) = — r ~ — 7TTT ~ i , (2.10) 

v ; 2[1 - cos(fc)] k 2 ' v ; 

where D is a constant and the latter approximation holds for small values of k. By 
performing the inverse Fourier transformation of Eq. (j2.10j) , we can also obtain the 
spatial two-point correlation function, 

1 N 1 
G(l) = T7T,G^i = j,T,^ l S(k), (2.11) 

where Gj ii+ ;=((r i — f)(r i+ i — f)) is the site-dependent two point function, yielding 

[ZZHSH 



2 V6 

for 1 / L. In particular, for the steady-state width one finds 

<^ 2 > = ^E S(k) = G(0) ^^-N-N (2.13) 

in one dimension [77j. This divergent width is caused by a divergent length scale, 
£, the "lateral" correlation length in the KPZ-like synchronization landscape. 

The measured steady-state structure factor [Fig. 12.4( a)]. obtained by simu- 
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lating the BCS based on the exact rules for the evolution of the synchronization 
landscape confirms the coarse-grained prediction for small k values, S(k) ~ 1/k 2 . 
Figure |23tb) shows the corresponding spatial two-point correlation function, G(l). 
Simulation of the BCS scheme in one dimension yields scaling exponents that agree 

(a) (b) 



260 




Figure 2.4: (a) The steady-state structure factor as a function of the 
wave number for the BCS scheme in ID. The small-A; 
course-grained prediction (consistent with the steady-state 
EW/KPZ universality class in ID) is indicated by a dashed 
line [Eq. (|2.10|) ]. Note the log-log scales, (b) Steady-state 
spatial two-point correlation function. The straight line again 
indicates the asymptotic EW/KPZ behavior in one dimension 

[Eq. «232D]. 

within error of the predictions of the KPZ equation [73*1 HU Ell- The time evolu- 
tion of the width [Fig. 12.5( a)] shows that the growth exponent (3 ~ 1/3. Looking 
at the system-size dependence of the steady-state width [Figure l2"3T b)]. we find 
the roughness exponent a ~ 1/2, consistent with the one-dimensional KPZ value, 
(w 2 ) ~ N 2a ~ N. The dynamic exponent values found from the width as a function 
of the cross-over time and z=a//3 are the same, about 3/2. The inset in Fig. 12.5( a) 
shows that the scaled version of the width evolution by using the scaling exponents 
is consistent with the Family- Vicsek relation [Eq. (|2.5|) ]. although with relatively 
large corrections to scaling. 

The steady-state width distributions, P(w 2 ), have been introduced to provide 
a more detailed characterization of surface growth processes [BE EHJ EH3 EI] and have 
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(a) (b) 




Figure 2.5: (a) Time evolution of the width for different system-sizes in 
the BCS scheme in ID. The dynamics of the evolution of 
the virtual times is governed by the KPZ equation. The 
inset shows the same data on scaled axes, (w 2 )/N 2a versus 
t/N z . Each curve has been obtained by averaging over at 
least fifty realizations, (b) The steady-state width of the time 
horizon for the one-dimensional BCS as a function of system- 
size. The dashed straight line represents the asymptotic one- 
dimensional KPZ/EW behavior, (w 2 (N)) ~ N 2a with a=l/2. 

been used to identify universality classes [3U]. Note that the width 

1 N d 

w " = jvaEfcW - r(t)] 2 , (2-14) 

itself is a fluctuating quantity The width distribution for the EW (or a steady-state 
one-dimensional KPZ) class is characterized by a universal scaling function, 
such that P{w 2 ) = (w 2 )^ 1 ^(w 2 / (w 2 )), where can be calculated analytically 

for a number of models, including the EW class [HE]- The width distribution for 
the basic synchronization scheme is shown in Fig. 12.61 Systems with N > 10 3 show 
convincing data collapse onto this exact scaling function. The inset in Fig. 12.61 
shows the same graph in log-normal scale to show the collapse at the tail of the 
distribution. The convergence to the limit distribution is very slow when compared 
to other microscopic models (such as the single-step model [73J EH]) belonging to 
the same KPZ universality class. 

Now we discuss our findings for the steady-state utilization of the BCS scheme. 
As stated above, the synchronization landscape of the virtual times belongs to the 
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Figure 2.6: ] 

Scaled width distributions for the BCS scheme in ID. The exact asymptotic 
EW/KPZ width distribution [SB] is shown with a dashed line. The inset shows 
the same distributions on log-normal scales. 

EW universality class in one dimension. This implies that the local slopes in the 
steady-state landscape are short-range correlated [77] . Hence the density of lo- 
cal minima in the synchronization landscape, and in turn the utilization, remains 
nonzero in the infinite system-size limit jSHllZZI- For a fixed N, the utilization drops 
from relatively higher initial value at early times to its steady-state value in a very 
short time [Fig. 12.7( a)]. Further, the steady-state utilizations for various systems 
converge to the asymptotic system-size independent value. In ID, since a=0.5 the 
utilization, by using Eq. ()2.7|) as a function of system size, becomes 

com 

(„(iV)> ~ (ti(oo)) + (2.15) 

as shown in Fig. I2.7f b). For the KPZ model [Eq. ()2.8[) ] (u(oo)) = 1/4, since in 
the steady state the slopes are delta-correlated, resulting in a probability 1/4 for 
the configuration in Fig. 12.2( b). corresponding to a local minimum. For the actual 
BCS synchronization profile (tt(oo)) ~ 0.2464 (SHIIZZ], as a result of non-universal 
short-range correlations present for the slopes in the specific microscopic model [STJ 
as can be seen in Fig. 12.81 

In summary, we have shown that the ID BCS time horizon belongs to the 
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(a) (b) 




l/N 



Figure 2.7: (a) Utilization in the ID BCS scheme as a function of time 
for various system sizes, (b) Steady-state utilization as a 
function of the l/N 2 ^ 1 "^ as suggested by Eq. (J2.7|) with the 
ID KPZ roughness exponent a— 1/2. The dashed line is a 
linear fit, (u(N)) w 0.2464 + 0.2219/iV 

KPZ universality class as N goes to infinity, then the measurement part of the ID 

BCS scheme is not scalable. 



2.3 Two-Dimensional Basic Conservative Synchronization 
Network 

A natural generalization to pursue is the synchronization dynamics and the 
associated landscapes on the networks in higher dimensions. One might ask whether 
PDES of two-dimensional phenomena exhibit kinetic roughening of the virtual time 
horizon. Preliminary results indicated that this is the case [T6*l 19*2*] . In this section 
we give detailed results when the BCS scheme is extended into a two-dimensional 
lattice in which each node has four nearest neighbors. We consider a system with 
periodic boundary conditions in both axes as can be seen in Fig. 12.9( a). 

The same microscopic rules, i.e., each node increments its local simulated time 
by an exponentially distributed random amount when it is a local minima among its 
nearest neighbors, are applied to this lattice. As in the one-dimensional case, during 
the evolution of the local simulated times correlations between the nodes develop in 
the system. One observes a rough time surface in the steady-state of the 2D BCS 
network. Figure 12.10( a) shows the contour plot of the simulated time surface for 
BCS scheme in 2D. In 2D as well, we observe kinetic roughening of the BCS scheme. 
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Figure 2.8: Slope-slope correlation function for the BCS scheme on a ID 



The simulated time surface for a finite system roughens with time in a power-law 
fashion. It then saturates after some system-size dependent crossover time to its 
system-size dependent steady-state value, as shown in [Fig. l2.lfT a)]. Our estimate 
for the growth exponent in the early-time regime is /3=0.125, significantly smaller 
than that of one dimension. 

The roughness exponent a for KPZ-like systems have been measured and es- 
timated in a number of experiments and simulations |73]. Since exact exponents for 
the higher- dimensional KPZ universality class are not available, for reference, we 
compare our results to a recent high-precision simulation study by Marinari et al. 

on the restricted solid-on-solid (RSOS) model j^l], a model believed to belong 
to the KPZ class. They found in [HS] that a~0.39 for the 2D RSOS roughness ex- 
ponent. While our simulations of the virtual time horizon show kinetic roughening 
in Fig. 12. lir a), the scaled plot, suggested by Eq. ()2.5|h indicates very strong correc- 
tions to scaling for the BCS in 2D (inset). Figure I2.11f b) and Fig l2. 121 also indicates 
that the (KPZ) scaling regime is approached very slowly in the steady-state, which 
is not completely unexpected: for the ID BCS scheme as well, convergence to the 
steady-state roughness exponent [Fig. 12.5( a)] and to the KPZ width distribution 
[Fig. 12.5( b)] only appears for linear system sizes N > 0(1O 3 ). Here, for the 2D case, 



short-range network for various system sizes. 
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(a) (b) 




Figure 2.9: Communication topologies in 2D. (a) 2D regular network, 
where each node is connected to its four nearest neighbors, 
(b) SW synchronization network. One random link per node 
is added on top of the 2D regular network. Arrowed lines 
show the bidirectional random links between the nodes. 



(a) (b) (c) 




Figure 2.10: Synchronization landscapes as contour plots for the 2D BCS 
on regular and SW networks of 128x128 nodes, (a) for the 
BCS scheme on a regular lattice with only nearest-neighbor 
connections (equivalent top=0); (b) for p=0.1; (c) for p=1.0. 

the asymptotic roughness scaling [Fig. 12.11T b)] and width distribution [Fig. 12.12] 
has not been reached for the system sizes we could simulate (up to linear system 
size iV=4096). Nevertheless the trend in the finite-size behavior, and the identical 
microscopic rules (simply extended to 2D) suggest that 2D BCS landscape belongs 
to the 2D KPZ universality class. 

For further evidence, we also constructed the structure factor for the 2D BCS 
steady-state landscape. As shown in Fig. I2.lffl a). S(k x , k y ) exhibits a strong singu- 
larity about k=0. For further analysis, we exploited the symmetry of S(k x , k y ) that 
it can only depend on |k| = Jk\ + k£. Hence, we averaged over all directions having 
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t N 



Figure 2.11: (a) Time evolution of the width in 2D BCS scheme. The 
dashed line indicates the power-law behavior of the width 
before saturation with a growth exponent /3«0.125. The inset 
shows the scaled plot (w 2 ) /N 2a vs. t/N z . (b) Steady-state 
width of the 2D BCS scheme as a function of the linear 
system-size. The dashed line corresponds to the asymp- 
totic 2D KPZ scaling with roughness exponent 2a=0.78 as 
obtained by high-precision simulations of the RSOS model 
|93j. Note the log-log scales. 




Figure 2.12: The scaled width distributions for the 2D BCS scheme. The 
solid curve is the asymptotic 2D KPZ scaled width distribu- 
tion, again from high-precision RSOS simulations |95j. Note 
the log-normal scales. 
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Figure 2.13: Structure factor for the BCS scheme in 2D. (a) as a func- 
tion of the wave number components k x and k y . (b) as a 
function of the magnitude of the wave number for differ- 
ent system-sizes. The dashed line shows the asymptotic 2D 
KPZ behavior for small values of |k| [Eq. (|2.16|) ] with a=0.39 
|93j. Note the log-log scales. 

the same wave number |k| to obtain ( S r (|k|) . For small wave numbers we found that 

it diverges as 

5 (l k D - ^ > ( 2 - 16 ) 

with a = 0.39, as shown in Fig. 12.13T b). This is consistent with the small-fc behavior 
of the structure factor of the 2D KPZ universality class with roughness exponent 
a~0.39 [93 . As noted above, the scaling of the width and its distribution exhibited 
very slow convergence to those of our reference-KPZ system, the RSOS model [§31 
I95j . This is likely the effect of the non- universal and surprisingly large contributions 
coming from the large-fc modes, leading to very strong corrections to scaling for the 
system sizes we were able to study in 2D. Looking directly at the small- |k| behavior 
of S'Qkl) is undisturbed by the larger-|k| modes, hence the agreement with the 2D 
KPZ scaling is relatively good. 

The steady-state utilization (density of local minima) in the 2D BCS synchro- 
nization landscape approaches a nonzero value in the limit of infinite number of 
nodes, (w(oo)) ~ 0.1201 as can be seen in Fig. l2.l4T a). This is consistent with the 
general approximate behavior (u(oo)) ~ const./ d on hypercubic lattices in d dimen- 
sion [HUES; i.e., (w(oo)) is approximately inversely proportional to the coordination 
number. The system-size dependence of the steady state utilization in the 2D BCS 
also follows Eq. (|2.7j) . As shown in Fig. l2.14T b). for a two-dimensional BCS scheme, 
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Figure 2.14: (a) The time evolution of the steady-state utilization in 2D 
BCS scheme for various system sizes, (b) The steady-state 
utilization in the 2D BCS scheme as a function of l/jV 2 ^ 1 "") as 
suggested by Eq. (|2.7|) with the 2D KPZ roughness exponent 
a=0.39. The dashed line is a linear fit, (u(N)) » 0.1201 + 
0.1585/iV L22 . 



the utilization becomes 



(u(N)) ~ (u(oo)) + 



const. 



(2.17) 



where we have used the 2D KPZ roughness exponent a=0.39 [93 . 

We have seen that similar to the ID case, the 2D BCS scheme also exhibits 
a finite progress rate but the width diverges as the system size goes to infinity, 
hindering measurement scalability. 



2.4 The K-random Synchronization Network 

In order to obtain an analytically tractable scalability model for the BCS, 
Greenberg et al introduced the iCrandom interaction network model f64| . In this 
model at each update attempt PEs compare their local simulated times to those 
of a set of K randomly chosen PEs. This set is rechosen for each update attempt 
(i.e., the network is "annealed"), even if a previous update attempt has failed. It 
was shown that in the limit of t—>-oo and N^oo, the utilization (or the average rate 
of progress) converges to a non-zero constant, 1/(K+1) [Fig. !2.15T a)]. They also 
suggested that the scaling properties of iCrandom model as t^oo and iV— >oo are 
universal and hold for regular lattices as well. But changing the interaction topology 
from the nearest neighbor PEs on a regular lattice to randomly chosen PEs changes 
the universality class of the time horizon. Simply, the underlying topology has a 
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crucial effect on the universal behavior of the time horizon. The random (annealed) 
interaction topology of the i^-random model results in a mean-field-like behavior, 
where the simulated time surface is uncorrelated and has a finite width in the limit 
of an infinite number of PEs [Fig. l2.l5T b)]. Their conjecture for the width does not 
hold, thus, the BCS scheme for regular lattices cannot be equivalently described by 
the i^-random model (at least not below the upper critical dimension of the KPZ 
universality class [95J). 
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Figure 2.15: The steady-state observables in f^-random network, (a) uti- 
lization as a function of system size, (b) width as a function 
of system size. 

However, we were inspired by jM] to change the communication topology of the 
PEs by introducing random links in addition to the necessary short-range connec- 
tions. In the next chapter we present our modification to the original conservative 
scheme on regular lattices to achieve a fully scalable algorithm where both scalability 
conditions are satisfied. 



CHAPTER 3 

SYNCHRONIZATION IN SMALL- WORLD NETWORKS 



The divergent width of the synchronization landscapes in regular networks for very 
large systems, as discussed in the previous chapter, is the result of the divergent 
lateral correlation length £ of the virtual time surface reaching the system size N 
in the steady-state [Z31 HH1 EH1 IHU E2 • To de-correlate the simulated time horizon, 
first, we modify the virtual communication topology of the PEs. The resulting 
communication network must include the original short-range (nearest-neighbor) 
connections to faithfully simulate the dynamics of the underlying system. In the 
modified network, the connectivity of the nodes (the number of links of a node) 
should remain non-extensive (i.e., only a finite number of virtual neighbors per 
node is allowed). This is in accordance with our desire to design a PDES scheme 
where no global intervention or synchronization is employed (PEs can only have 
0(1) communication exchanges per step). It is clear that the added synchronization 
links (or at least some of them) have to be long range. Short range links alone would 
not change the universality class and the scaling properties of the width of the time 
horizon. One can satisfy this condition by selecting the additional links (called 
small-world links) randomly among all the nodes in the network. Also, fluctuations 
in the individual connectivity should be avoided for load balancing purposes, i.e., 
requiring the same number of added links (e.g., one) for each node is a reasonable 
constraint. 

One may wonder how the collective behavior of the PDES scheme would change 
if each node was connected to the one located at the "maximum" possible distance 
away from it (N/2 on a ring) [Fig. l3.ir a)]. Consider a linear coarse-grained Langevin 
equation with Gaussian noise where the effective strength of the added long-range 
links is 7, 

d t n(t) = (r i+1 + - 27}) - 7(7; - T i+N/2 ) + i]i(t) , (3.1) 

with periodic boundary conditions. Since Eq. ()3.1|) is translationally invariant, 
Fourier transformation decouples the equations for different wave numbers k and 
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(a) X 











Figure 3.1: (a) Maximal-distance network in ID where each node is con- 
nected to nearest neighboring nodes and to the node which 
is at the maximum distance, (b) Fully connected network 
where each node is connected to every other node in the net- 
work, (c) Small-world (SW) synchronization network in ID, 
where each node is connected to a randomly chosen one in 
addition to the nearest neighbors. 

one obtains for the steady-state structure factor (see Appendix A) 



S(k) 



D 



2[1 - cos(Jfe)] + 7[1 - cos(fciV/2)] 



(3.2) 



where k = (2nn)/N, n — 0,1,2, ... , N—l as before (and N is even for simplicity). 
Then for the average width we find 
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N ^ 2[1 - cosffc)] + 7 [1 - cos(fciV/2)] " 
Separating the terms with even and odd n values above, we find 
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2[1 - cos(27rn/AT)] 



(3.4) 



The first sum yields a finite iV-independent value in the N—>oo limit. The second 
sum, on the other hand, is identical to the width of the EW model on a regular net- 
work of size N/2. Thus, in the large N limit the width for the "maximal-distance" 
connected network [Fig. 13. If a)] diverges as (w 2 (N))~DN/24:. Indeed, one can real- 
ize, that such regularly patterned long-range links make the network equivalent to a 
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2x(iV/2) quasi one- dimensional system with only nearest-neighbor interactions and 
helical boundary conditions. The above extreme case suggests, that the maximal- 
distance synchronization network cannot work either. 

Now instead, consider the scenario where each node is connected to every 
other node in the network by a "weak" link, i.e., constructing a "fully" connected 
network as shown in Fig. 13.1( b) . In this case we can rewrite the Langevin equation 
in Eq. (|3.1|) by using an effective strength of links, j/N, 

dMt) = (r m + r^x - 1r % ) - j- ^(r, - Tj ) + rn(t) . (3.5) 

i=i 

Performing the summation above yields an exact mean-field-like coupling, where 
each node is coupled to the average height: 

d t n{t) = (r i+1 + Ti_x - 2r<) - 7(7^ - f) + ifc(t) , (3.6) 

where f = Y$Li T j is the average height. For the steady-state structure factor one 
finds (see Appendix A) 

S(k) = -. ^77^ • (3.7) 

v ; 2[l-cos(ife)] + 7 

Then by using the relation between the structure factor and the width [Eq. 13 .3j one 

obtains 

1 _ 1 D r°° dk D D 

{W) = N £ b{k) = N ^ 2[l-cos(ifc)]+7 " J-- 2^WT^ = Vf ' (3 - 8) 

The above relation between the mean-field coupling constant 7 and the width shows 
that for a non-zero 7 the width is finite in the thermodynamic limit. But connecting 
each node in PDES to every other node would be cost-inefficient and cumbersome in 
terms of communication times. As we discuss in the next section, one can construct 
an "effectively" fully connected and yet cost-efficient network which has a finite 
width and relatively high progress rate by only employing a few random links. 
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3.1 One-Dimensional Small- World-Connected Synchroniza- 
tion Network 

As we have seen in Chapter 2 our attempts to make the PDES fully scalable 
have failed because the PDES on short-range network is not measurement scalable 
(width is infinite for an infinite system). One of the proposed networks discussed 
in the previous section, the maximal-distance network, fails as a candidate for a 
fully scalable synchronization scheme because it is effectively equivalent to a short- 
range network. On the other hand, the fully connected network, is very inefficient in 
performance although it is measurement scalable. Motivated by the social networks 
we propose a network topology in which each node is connected to exactly one 
another randomly chosen node in addition to the nearest neighbors, resulting in a 
SW-like synchronization network. As we shall see, adding one random link to every 
node is cost-efficient and makes the network an "effectively" fully-connected one. 

One of the basic structural characteristics of SW-like networks is the "low 
degree of separation" between the nodes. The most commonly used observables 
to analyze this property are the average shortest path length, 5 avg (N), and the 
maximum shortest path length, S max (N). The shortest path length between two 
nodes is defined as the minimum number of nodes one has to visit in order to go 
from one of the nodes to the other. The average shortest path length is the average 
of all these possible shortest paths between the nodes in the network. The maximum 
shortest path length, also known as diameter of the network, is the length of the 
longest among the shortest paths in the network. Both of these observables scale 
logarithmically with the system-size N in SW-like networks [HE]. The system-size 
dependence of these path lengths for our one- dimensional SW network, in which we 
have both nearest neighbors and random SW links, is logarithmic as expected, see 
Fig.E2Ka)-(b). 

We now describe the modified algorithmic steps for the SW-connected PEs 
[Hj. In the PDES on SW synchronization network, in every parallel time step each 
PE with probability p compares its local simulated time with its full virtual neigh- 
borhood, and can only advance if it is the minimum in this neighborhood, i.e., if 
T~i(t) < min{rj_i(t), r i+ i(t), T r (i)(t)}, where r(i) is the random connection of PE i. 




Figure 3.2: (a) Average and maximum shortest path lengths as a function 
of the number of nodes for the SW synchronization network 
in ID, as described in the text. The latter is also referred to 
as the diameter of the network. The solid and dashed lines 
both indicate the logarithmic dependence. Note the normal- 
log axes, (b) Histogram for the length of the shortest paths 
in one realization of the network with system size iV=10 4 . The 
dashed curve is appropriately fitted Gumbel distribution for 

5 S — a 

minima in the form f(5) = \e~^~ e b where a=12.2 and 6=1.48. 

With probability (1 — p) each PE follows the original scheme, i.e., the PE then can 
advance if Tj(i) < min{Tj_i(t), Tj + i(t)}. Our network model including the nearest 
neighbors and random SW links can be seen in Fig. l3.lf c). Note that the occasional 
extra checking of the simulated time of the random neighbor is not needed for the 
faithfulness of the simulation. It is merely introduced to control the width of the 
time horizon. The occasional checking of the virtual time of the random neighbor 
(with rate p) introduces an effective strength J = J(p) for these links. Note that this 
is a dynamic "averaging" process controlled by the parameter p and can possibly 
be affected by nonlinearities in the dynamics through renormalization effects. The 
exact form of J(p) is not known. The only plausible properties we assume for J is 
that it is a monotonically increasing function of p and is only zero when p=0. 

In what follows, we focus on the characteristics of the dynamics on the network. 
As we have seen for the one-dimensional ring, the communication protocol between 
the nodes (up to linear terms) leads to simple relaxation, governed by the Laplacian 
on the regular grid. Random communication links give rise to analogous effective 
couplings between the nodes, corresponding to the Laplacian on the random part of 
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the network. Thus, the large-scale properties of the virtual time horizon of our SW 
scheme are governed by the effective Langevin equation 

dth = V 2 f, - ~ + - + Th(t) , (3.9) 

j 

where the ... stands for infinitely many non-linear terms (involving non-linear inter- 
actions through the random links as well), and is proportional to the symmetric 
adjacency matrix of the random part of the network: Jij=J(p) if sites i and j are 
connected by a random link and Jij—0 otherwise. For our specific SW construction 
each node has exactly one random neighbor, i.e., there are no fluctuations in the 
individual connectivity (degree) of the nodes. Our simulations (to be discussed be- 
low) indicate that when considering the large-scale properties of the systems, the 
Laplacian on the random part of the network generates an effective coupling 7 to 
the mean jH] . At the level of the structure factor, it corresponds to an effective mass 
7 (in a field-theory sense) 

7 + Ar 

where 7=7(p) is a monotonically increasing function of p with 7(0)=0. 

We emphasize that the above is not a derivation of Eq. (|3.10p . but rather 
a "phenomenological" description of our findings. It is also strongly supported 
by exact asymptotic results for the (linear) EW model on SW networks, where 
the effect of the Laplacian on the random part of the network is to generate a 
mass |2Z1 121] • The averaging over the quenched network ensemble, however, can 
introduce nontrivial scaling and corrections in the effective coupling fZT \ l2*£ | I2TJ]. In 
our case, this is further complicated by the nonlinear nature of the interaction. The 
results of "simulating the simulation", however, suggest that the dynamic control 
of the link strength and nonlinearities only give rise to a renormalized coupling and 
a corresponding renormalized mass. Thus, the dynamics of the BCS scheme with 
random couplings is effectively governed by the EW relaxation in a small-world 
[23 EH EH] • From Eq. (|3.10|) it directly follows that the lateral correlation length in 
the infinite system-size limit 

£~7~ 1/2 , (3.11) 
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i.e., becomes finite for all p ^0 [Fig. 12.3( b)]. The presence of the effective mass term 
in the structure factor Eq. (j3.1()j) implies that limfc-^o S(k) < oo, that is, there are 
no large amplitude long-wavelength modes in the surface. Consequently, the width 
(w 2 ) = {l/N)J2k^oS(k) is also finite. Our simulated time landscapes indeed show 
that they become macroscopically smooth when SW links are employed [Fig. 12.3( b)]. 
compared to the the same dynamics with only short-range links [Fig. 12.3( a)], 



(a) (b) 




k k 



Figure 3.3: Structure factor for the ID SW synchronization scheme with 
(a) p—0.1 and (b) p=l. The insets show l/S(k) vs. k 2 for 
small values of k, confirming the coarse-grained prediction 
Eq. flXTDD. 

In the simulations, we typically performed averages over 10-100 network real- 
izations, and compared the results to those of individual ones. Our results indicate 
that the observables we studied (the width and its distribution, the structure factor, 
and the utilization) display strong self- averaging properties, i.e., for large enough 
systems, they become independent of the particular realization of the underlying 
SW network. Simulation results for the structure factor, S(k), for the SW syn- 
chronization scheme are shown in Fig. 13.3( a) and (b). If an infinitesimally small p 
is chosen, S(k) approaches a finite constant in the limit of A; — ^-0, and in turn, the 
virtual time horizon becomes macroscopically smooth with a finite width. 

A possible (phenomenological) way to obtain the correlation length is to fit 
our structure factor data to Eq. ()3.1U|) . more specifically, by plotting 1/S(k) versus 
k 2 , which exhibits a linear relationship. By a linear fit, 7 is then the ratio of the 
intercept and the slope (insets in Fig. 13. 3j) . Alternatively, one can confirm that 
the massive propagator Eq. (|3.1U|) indeed leads to an exponential decay in the two- 
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Figure 3.4: The spatial two-point correlation function as a function of 



the Euclidean distance / between the nodes for two different 
values of p, indicating an exponential decay with an average 
correlation length £~27 and £~1.6 for p=0.1 and p=1.0, respec- 
tively. The number of nodes is iV=16,384. The inset shows 
the same data in log-normal scale. 



point correlation function from which the correlation length can also be extracted 
[Fig. 13. 4j . In our case with a system-size iV=16,384, £~27 for p=0.1 and £?»1.6 for 
p=l. Figure l3~5T c) shows the correlation length extracted from the structure factor 
S(k) as a function of p for different system-sizes. 

An alternative way to determine the correlation length is using the finite-size 
scaling of the width (w 2 ). From dimensional analysis it follows that (w 2 ) has length 
dimension in ID. There are two length scales in the system: the linear system size 
N and the correlation length £ of an infinite system. For p — 0, (w 2 )~N, while for 
p>0 and iV— >oo, (u> 2 )~£. For non-zero p and finite N the scaling of the steady-state 
width can be expected [2H] to follow 




(3.12) 



where g(x) is a scaling function such that 




const, if x3>l 



(3.13) 



For non-zero p and for sufficiently small systems (iV<C£(p)) one can confirm that 
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P 

Figure 3.5: Correlation length (£) vs. p for different system-sizes. The 
dashed line corresponds to the estimate of the exponent s, 
~ V ~ s with s^0.84, in the small-p regime for an asymptot- 
ically infinite system. 

the behavior of the width follows that of the system without random links (w 2 )~N 

[Fig. 13.6( a)]. For large-enough systems, on the other hand, we can extract the 

p-dependence of the infinite-system correlation length as (w 2 )~ £(p) [Fig. 13.6( b)]. 

yielding 

t(p)~p- s , (3.14) 

where s~0.84. 

We then studied the data collapse as proposed by Eq. ()3.12j) by plotting (w 2 )/N 
vs. l/p s N. In fact, we performed this rescaling originating from both raw data sets 
Fig. 13.6( a) and (b). The resulting scaled data points in Fig. 13.7( a) and (b), of 
course, are identical in the two figures, but the lines connect data points with the 
same value of p in Fig. 13.7( a) and with the same value of N in Fig. 13.7( b). These 
scaled plots in Fig. 13.71 indicate that there are very strong corrections to scaling: 
data for larger p or smaller N values "peel off" from the proposed scaling form in 
Eq. ()3.13j) relatively quickly. These strong corrections are possibly the result of the 
nonlinear nature of the interaction between the nodes on the quenched network. We 
note that the linear EW model on identical networks exhibits the scaling proposed 
in Eq. (j3.12|) and Eq. ()3.13j) without noticeable corrections [2H] [Fig. 13.8( a) and (b)]. 

The non-zero 7, leading to a finite correlation length, £, ensures a finite width 
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(a) 



(b) 





Figure 3.6: (a) The average steady-state width in the ID SW synchro- 
nization landscape as a function of the system size for differ- 
ent values of p in the range of [10 _3 ,10 -1 ]. The dashed line 
indicates the EW/KPZ scaling, corresponding to the small 
system-size behavior, (b) The average steady-state width in 
the ID SW synchronization landscape as a function of p for 
different values of N in the range of [40, 10 4 ]. The dashed line 
indicates the best-fit power law in the asymptotic large-iV 
small-p regime to extract the correlation length exponent s, 
according to Eqs. <^T^)-<^T^. 

in the infinite system-size limit. Our simulations show that the width saturates to 

a finite value for p>0 [Fig. 13.9( a)]. The distribution of the width P{w 2 ) changes 

from the EW/KPZ distribution to a delta function for non-zero values of p as the 

system size goes to infinity. Figure 13*. 1 Of a) and (b) shows the width distributions for 

p=0.1 and p=l, respectively. The scaled width distributions (to zero mean and unit 

variance), however, exhibit the convergence to a delta function through nontrivial 

shapes for different values of p. For p=0.1 [Fig. 13.11( a)] the distributions appear to 

slowly converge to a Gaussian as the system-size increases. For p=l [Fig. 13.11( b)]. 

the trend is opposite up to the system sizes we could simulate; as the system-size 

increases, the distributions exhibit progressively non-Gaussian features (closer to an 

exponential) around the center up to iV=10 6 . Note that not only the average width 

(w 2 ), but also the full distribution P{w 2 ) is self-averaging, i.e., is independent of the 

particular realization of the underlying SW network. 

To get some insight into the possible role of the disorder in approaching the 

limit distribution of the width, we studied the two-point function for individual pair 

of nodes. Note, that by construction, the observable previously considered, G(l), 

is the site (or spatially) averaged two-point function over all nodes with Euclidean 
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l/p s N l/p s N 
Figure 3.7: The scaled versions of Fig. 13.6( a) and (b), as proposed by 



Eq. (ETT211 and Eq. (fSTTlj) by plotting (w 2 ) /N vs. 1 /p s N with 
s=0.84. The data points are identical in (a) and (b), but data 
points connected by a line have the same value of p in (a) 
and have the same value of N in (b), as obtained by rescaling 
Fig. 13.61( b) and (b), respectively. The dashed line corresponds 
to the asymptotic small- x behavior of the scaling function g[x) 
[Eq. iH]. 

distance /, G(/)=^ Y<iLi(( T i — ^)( T i+i — t)). If the height values on the nodes for a 
fixed network realization are sufficiently weakly correlated, the width distribution 
should converge to a Gaussian, governed by the central limit theorem jH3 98J . As 
we saw above, for larger values of p, this may not be the case, at least for finite 
systems. 

In order to have some measure how the individual terms in the width are 
correlated, w 2 =(l/N) Y,iLi( T i ~ ^) 2 > we constructed the two-point function for all 
sites i for a few chosen separation / [Fig. I3.12j for a fixed network realization. Of 
course, as already discussed, averaging over all i, will yield an exponential decay as 
a function of I. Now, instead, we focus on the full two-point correlation "profile" 
for a given separation I. As can be seen in Fig. 13.12( a). for p—0.1, the node- 
to-node fluctuations in the two-point correlation profile, compared to their spatial 
average G(l), are small. With the increasing strength of the disorder (p=l), however, 
certain sites develop abnormally large, frozen correlations as shown in Fig. 13.12T b). 
The deviation of the two-point correlations for these few nodes, from the mean 
G(l), is much larger than those for the other nodes, and is comparable to the mean 
itself. This property can work "against" the necessary conditions of the central limit 
theorem and, thus, can have a strong effect on the convergence (or the apparent lack 
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Figure 3.8: (a) Steady-state width for linear EW model as a function 
of the system-size from [28J for comparison. The solid lines 
from are from theoretical calculations and the symbols from 
simulations. The dashed line shows the behavior of the 
width when there is no SW links, namely the BCS scheme 
(short-range network) (b) The scaled version of (a) by us- 
ing Eq. (ETH^l) and Eq. (f3TT5j) . For the linear EW model the 
scaling does not have any noticeable corrections. 

of it) of the width distribution to a Gaussian. 

The effect of the random communication links on the utilization can be un- 
derstood as follows. According to the algorithmic rules, the virtual times of the full 
network neighborhood (including the random neighbor) are checked with probabil- 
ity p, while with probability (1—p) only short-ranged synchronization is employed. 
Thus, the average progress rate of the simulated times becomes 



(u) 



p)(6(-0 l _ 1 )e(0 J )) +p(e(-0 l _ 1 )6(0 l )e(r r(l) -r,)) . 



(3.15) 



Note that the disorder (network) averaging makes the right hand side indepen- 
dent of i. In the presence of the SW links the regular density of local minima 
(0(— 0j_i)O(0j)) remains nonzero (in fact, increases compared to the short-range 
synchronized BCS scheme) [HI IH3 IHHj- Thus, for an infinitesimally small p, the 
utilization, at most, can be reduced by an infinitesimal amount, and the SW- 
synchronized simulation scheme maintains a nonzero average progress rate. This is 
favorable in PDES where global performance requires both finite width and nonzero 
utilization. With the SW synchronization scheme, both of these objectives can be 
achieved. For example, for p=0.1 (-u(oo)) ~ 0.242, while for p=1.0 (w(oo)) ~ 0.141. 
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N 

Figure 3.9: The steady-state width as a function of system size for ID 
SW synchronization network as a function of system size. 
The width saturates to finite values as long as p ^ 0. The 
dashed line shows the power-law divergence for BCS in ID. 
Note the log-log scales. 

The steady-state utilization as a function of system size for various values of p can 

be seen in Fig. 13.131 

3.2 Two-dimensional Small- World- Connected Synchroniza- 
tion Network 

The de-synchronization (roughening of the virtual time horizon) again mo- 
tivates the introduction of the possibly long-range, quenched random communica- 
tion links on top of the 2D regular network. Each node will have exactly one (bi- 
directional) random link as illustrated in Fig. 12.9( b). The actual "microscopic" rules 
are analogous to the ID SW case: with probability p each node will check the local 
simulated times of all of its neighbors, including the random one, and can increment 
its local simulated time by an exponentially distributed random amount only if it 
is a "local" minimum (among the four nearest neighbors and its random neighbor). 
With probability (1— p), only the four regular lattice neighbors are checked for the 
local minimum condition. 

The effect of the synchronization through the random links is, again, to stop 
kinetic roughening and to suppress fluctuations in the synchronization landscapes. 
Contour plots of the synchronization landscapes are shown in Fig. 12.101( b) and (c) 
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Figure 3.10: Steady-state width distributions in the ID SW synchroniza- 
tion scheme for (b) p=0.1 and for (c) p=1.0. The distri- 
butions were constructed using ten different network real- 
izations, except for iV=10 6 , where only one realization was 
obtained due to computational limitations. All width dis- 
tributions, however, indicated self-averaging. 
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Figure 3.11: Steady-state width distributions for the ID SW synchro- 
nization scheme scaled to zero mean and unit variance for 
(a) p=0.1 and (b) p=1.0. The dashed curves are similarly 
scaled Gaussians for comparison. 

for p=0.1 and p=l, respectively. Our results indicate that for any nonzero p the 

width of the surface approaches a finite value in the limit of N—kx> [Fig. IH.14T a)]. At 

the same time, the distribution approaches a delta-function in the large system-size 

limit as shown in Fig. I3.15f a) and (b). The scaled distributions (to zero mean and 

unit variance) again show that at least for the finite systems we observed, the shape 

of these distribution differs from a Gaussian [Fig. IH.lBT a) and (b)]. The deviation 

from the Gaussian around the center of the distribution is stronger for a larger value 
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Figure 3.12: Two-point correlation function for each node i for various 
separation / for a fixed ID SW synchronization network with 
size iV=16,384 (a) p=0.1 and (b) p—1. The horizontal lines 
correspond to spatial average G(l) for each /. 

of p where the influence of the quenched random links are stronger. Note that for 

the ID SW landscapes as well, the width distribution only displayed a crossover 

to Gaussian behavior for smaller values of p and for very large linear system sizes 

(N > (9(10 4 ). In the 2D SW case, these linear system sizes are computationally not 

achievable, and the convergence to a Gaussian width distribution remains an open 

question. 

The underlying reason for the finite width is again a finite average correlation 
length between the nodes. The 2D structure factor exhibits a massive behavior, 
i.e., S'(lkl) approaches a finite value in the limit of k^O [Fig. 13.17( a) and (b)]. 
For small wave numbers, the approximate behavior of the structure factor is again 
SQkl) oc l/(|k| 2 + 7) as can be seen in the inset of Fig. I3.17f b). with strong finite- 
size corrections to 7. The relevant feature of the synchronization dynamics on a SW 
network is the generation of the effective mass 7. Nonlinearities can give rise to a 
renormalized mass, but the relevant operator is the Laplacian on the random part 
of the network. 

In the 2D SW synchronization scheme the steady-state utilization is smaller 
than its purely 2D counterpart (BCS in 2D), as a result of the possible additional 
checking with the random neighbors. For small values of p, however, it is reduced 
only by a small amount, and remains nonzero in the limit of an infinite number of 
nodes [Fig. 13.18] , For example, forp=0.1 (m)oo~0.1198, while for p=l (m) oo ~0.084. 
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Figure 3.13: Steady-state utilization of SW synchronization network in 
ID as a function of system-size for three different values of 
p=0 (BCS), p=0.1 and p=l. 

3.3 Synchronization in scale-free networks 

The Internet is a spontaneously grown collection of connected computers. The 
number of (only) webservers by February 2003 reached over 35 million [100] . The 
number of PC-s in use (Internet users) surpassed 660 million in 2002, and it is 
projected to surpass one billion by 2007 jlOlj . The idea for using it as a giant 
supercomputer is rather natural: many computers are in an idle state, running at 
best some kind of screen-saver software, and the "wasted" computational time is 
simply immense. Projects such as SETI@liome or the GRID consortium J7j are 
targeting to harness the power lost in screen-savers. 

Most of the problems solved currently with distributed computation on the 
Internet are "embarrassingly parallel" , i.e., the computed tasks have little or no 
connection to each other similar to starting the same run with a number of different 
random seeds, and at the end collecting the data to perform statistical averages. 
However, before complex problems can be solved in real time on the Internet a 
number of challenges have to be solved, such as the task allocation problem which 
is rather complex by itself [66 . 
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Figure 3.14: (a) The average steady-state width as a function of linear 
system size for different p values in the 2D SW synchroniza- 
tion scheme. The data for p=0 corresponds to the 2D BCS 
scheme on a regular network with only nearest-neighbor 
connections. 
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Figure 3.15: Steady-state width distributions in the 2D SW synchroniza- 
tion scheme for (a) p=0.1 and for (b) p=1.0 for various system 
sizes. 

Here one can ask the following question: Given that task allocation is resolved 
and the PE communication topology on the internet is a scale-free network, what are 
the scalability properties of a conservative synchronization scheme on such networks? 
Here we present numerical results, for the conservative PDES scheme, as measured 
on a model of scale- free networks, namely the Barabasi- Albert model (BA) Q02J . 
This network is created through the stochastic process of preferential attachment: 
to the existing network at time t of N nodes, attaches the N+l st node with m 
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Figure 3.16: Steady-state width distributions in the 2D SW synchroniza- 
tion scheme scaled to zero mean and unit variance for (a) 
p=0.1 and for (b) p=1.0. The dashed curves are similarly 
scaled Gaussians for comparison. 
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Figure 3.17: (a) Steady-state structure factor for the 2D SW scheme as a 



function of the wave-vector components k x and k y for p=1.0. 
(b) Structure factor as a function of magnitude of the wave 
vector, |k|, for the 2D SW synchronization scheme for p=1.0. 
The inset shows 1/S(k) vs. |k| 2 for small values of |k|. 

links ("stubs") at time t+1, such that each stub attaches to a node with probability 

proportional to the existing degree of the node. Here we will only present the 

m—1 case, when the network is a scale- free tree. Once we reach a given number of 

nodes in the network, we stop the process and use the random network instance to 

simulate the synchronization, using the evolution equation [Eq. (J2HJ) for the time 

horizon. While in case of regular topologies, the degree of a node is constant, e.g. 

for (i-dimensional "square" lattices, P^ Nd \k) = Idb^^d , fo r the BA network, it is 

a power law in the asymptotic (iV— >oo) limit : P BA (k) ~ 2m 2 k~ 3 . The condition 
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Figure 3.18: Steady-state utilization of SW synchronization network in 
2D as a function of system-size for three different values of 
p=0 (BCS), 0.1 and 1. 

for a site to be updated, i.e., that its virtual time is a local minimum, is a local 

property, and thus we expect that the utilization itself be correlated with local 

structural properties of the graph, such as the degree distribution. Figure 13.191 

shows the steady state (t— >oo, on a fixed BA network of iV nodes) values of the 

average utilization as function of the network size N. Notice that strictly speaking, 

the conservative PDES scheme is computationally non-scalable. An empirical fit 



suggest that (u(N)) = (u(N,t—oo)) 



with a ~ 3.322 and b = 0.902, 



In [aN^ 

i.e., the computation is only logarithmically (or marginally) non-scalable. For a 
system of iV=10 3 nodes we have found a steady state utilization (for the worst case 
scenario) of (w)=0.1328 (13.3% efficiency), while for a system of a million nodes, 
iV=10 6 , the utilization dropped only to (w)=0.073 (7.3% efficiency), by less than half 
of its value. For practical purposes the conservative PDES scheme can be considered 
computationally scalable, and this type of non-scalability we will call logarithmic (or 
marginal) non-scalability. 

Figure EUni shows the scaling of the width of the fluctuations for the time hori- 
zon as function of time, and the scaling of its value in the steady-state as function of 
system size (inset). Notice, that while the steady state width diverges to infinity, it 
only does so logarithmically, (w 2 (N,t=oo)} ~ In (cN d ^ with c~1.25 and <i~0.401. 
Some specific values: (u> 2 (10 3 , t=oo)) ~ 3.01, (u> 2 (10 5 , t=oo))~4.78. This means 
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Figure 3.19: Steady-state utilization for the scale-free BA network 

that the measurement phase of the conservative PDES scheme on a scale-free net- 
work is non-scalable either, however, it is so only logarithmically, and for practical 
purposes the scheme can be considered scalable. Overall, the conservative PDES 
scheme has logarithmic (or marginal) non-scalability on scale-free networks. 
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t 

Figure 3.20: Behavior of the time horizon width for the scale-free BA 
network (scale-free tree with m=l). The inset shows the 
scaling of the steady-state width as a function of system 
size, N (note the log-normal scale on the inset). 



CHAPTER 4 
EXTREME FLUCTUATIONS IN SMALL- WORLD 

NETWORKS 

Large fluctuations in networks are to be avoided for many reasons such as scala- 
bility or stability. In the absence of global intervention or control, this can be a 
difficult task. Motivated by the results in Chapter III jH] for small-world (SW) [TTi] 
synchronized autonomous systems in the context of scalable parallel computing, we 
investigate the steady-state properties of the extreme fluctuations in SW-coupled 
interacting systems with relaxational dynamics |1Q3| 1104] . Since the introduction 
of SW networks ^H] it has been well established that such networks can facilitate 
autonomous synchronization [3^1 EZ1 1105] . In addition to the average "load" in the 
network, knowing the typical size and the distribution of the extreme fluctuations 
106} 1107] 1108] is of great importance from a system-design viewpoint, since failures 
and delays are triggered by extreme events occurring on an individual node. 

Relationship between extremal statistics and universal fluctuations in corre- 
lated systems has been studied intensively PUJ HUH HID IEE21 QUI EH EES HUH 
riTRl EH EH1 [Ml H221 Ola EE2H TM ■ Th e focus of a number of these studies was 
to find connections, if any, between the probability distribution of global observ- 
ables or order parameters (such as the width in surface growth problems [ZHj or the 
magnetization in magnetic systems [126j ) and known universal extreme- value limit 
distributions |106t 11071 H08] . Recent analytic results demonstrated |118t 1121] that, 
in general (except for special cases |116[ 1117] ). there are no such connections. Here 
we discuss to what extent SW couplings (extending the original dynamics through 
the random links) lead to the suppression of the extreme fluctuations of the local 
order parameter or field variable in various noisy environments. We illustrate our 
findings on the actual PDES synchronization problem in scalable parallel comput- 
ing [Sj. In Sec. 14.11 we review the well-known extreme- value limit distributions for 
exponential-like and power-law-tail distributed random variables. In Sec. 14. HI we 
discuss the results [1031 1104] on the scaling behavior of the extreme fluctuations and 
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their distribution, for the Edwards- Wilkinson model [HE! on SW networks [27| with 
exponential-like noise. In Sec. !4.Hl we apply these results to study the extreme load 
fluctuations in SW-synchronized PDES schemes j4TH I41j. applicable to high perfor- 
mance parallel architectures and large-scale grid- computing networks. In Sec. 14.41 
we extend our studies ll()4j and consider the synchronization problem in the 
presence of power-law tailed noise. 

4.1 Extreme- Value Distributions for Independent Random 

Variables 

4.1.1 Exponential-like variables 

First, we consider the case when the individual complementary cumulative 
distribution P> (x) (the probability that the individual stochastic variable is greater 
than x) decays faster than any power law, i.e., exhibits an exponential-like tail in 
the asymptotic large- x limit. (Note that in this case the corresponding probability 
density function displays the same exponential-like asymptotic tail behavior.) We 
will assume P>(x) ~ e~ cx& for large x values, where c and 5 are constants. Then 
the cumulative distribution P™ ax (x) for the largest of the N events (the probability 
that the maximum value is less than x) can be approximated as \1'2'2\ I127[ [128 

p- ax (x) = [P^x)] 1 " = [1 - P>(x)} N = e "Mi--P>(*)] ~ e -NP>(x) ; 

where one typically assumes that the dominant contribution to the statistics of 
the extremes comes from the tail of the individual distribution P>(x). With the 
exponential-like tail in the individual distribution, this yields 

P^ ax (x) ~e- e_c ^ +lnW . (4.2) 

The extreme-value limit theorem states that there exists a sequence of scaled vari- 
ables x = (x — ojv)/fcjv, such that in the limit of N—>oo, the extreme-value probabil- 
ity distribution for x asymptotically approaches the Fisher-Tippett-Gumbel (FTG) 
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distribution [TTMITU7] : 

P™ x (x) ~ e- e ~" , (4.3) 

with mean (x)=7=0.577. . . (Euler constant) and variance a\ = (x 2 ) — (x) 2 = ir 2 /Q. 
From Eq. (|4.2|) . one can deduce 2 [128 that to leading order the scaling coefficients 



are a at 



ln(jV) 



1/6 



and 6 at = (5c) 1 w y . The average value of the largest 



of the iV original variables then scales as 



(imax) = a N + 6iv7 



ln(iV) 



.1/5 



(4.4) 



(up to 0(j^nvj) correction) in the asymptotic large- iV limit. When comparing with 
experimental or simulation data, instead of Eq. (J4.3|) . it is often convenient to use 
the form of the FTG distribution which is scaled to zero mean and unit variance, 
yielding 

P< ax (y) = e- e ~ iav+ ^ , (4.5) 

where a = it/ VQ and 7 is the Euler constant. In particular, the corresponding FTG 
density then becomes 

p™ x (y) = ae- {ay+ ^- e ~ {av+l) . (4.6) 

4.1.2 Power-law tailed variables 

Now consider independent identically distributed random variables where the 
tail of the complementary cumulative distribution decays in a power law fashion, i.e., 
P>(x) ~ A/x^ 1 for large values of x. Assuming again that the dominant contribution 
to the statistics of the extremes comes from the tail of the individual distribution 
P22EE2I1II2B], Eq.fDJ yields 

P™ ax (x) ~ e - NP>(x) ~ e' NA/x " . (4.7) 

Introducing the scaled variable x = x/b^, where &at = (AN) 1 ^, yields the standard 

form of the so called Frechet distribution for the extremes in the asymptotic large- N 

2 Note that for 6^=1, while the convergence to Eq. H4.2(l is fast, the convergence for the appro- 
priately scaled variable to the universal FTG distribution Eq. H4.8|) is extremely slow. 
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limit fTMfTUH] 

P™*(x) = e- 1 ^ , (4.8) 
and the corresponding probability density 

P maX ( £ ) = S^i^ ■ ( 4 - 9 ) 

One can note that the tail behavior of the extremes has been inherited from that 
of the original individual variables, i.e., p max (x) ~ l/x^ +l for large values of x. The 
first moment of the extreme exist if \x > 1 and for the average value of the largest 
of the N original power-law variables one finds 

(x max ) = 6^(5) ~ r(l - l/MAN) 1 '" ~ N 1 ^ (4.10) 

where T(z) is Euler's gamma function. For comparison with experimental or sim- 
ulation data it is often convenient to use an alternative scaling for the extremes 
y = x/(x max ), yielding collapsing (iV-independent) probability density functions 
similar to Eq. (|4.9jl 

fpax/ \ _ H -1/(1^(1-1//^) (a 1 -I \ 

p yy> r^i-v/x)^ 1 ■ 1 J 

4.2 Extreme Fluctuations in ID BCS Network 

We consider again the simplest stochastic model with linear relaxation on a 
SW network used in the previous chapter [Eq. (|3.9|) ]. In this chapter in addition to 
the width, we will study the scaling behavior of the largest fluctuations (e.g., above 
the mean) in the steady-state 

(A max ) = (r max - f ) . (4.12) 

As discussed in Chapter 2 and 3, Eq. (|3.9J) (and its generalization with a KPZ- 
like nonlinearity [8*4*j ) governs the steady-state progress and scalability properties of 
a large class of PDES schemes (HI EHl E3 E3 H29j . In this context, the local height 
variables {Tj(t)}^ 1 correspond to the progress of the individual processors after t 
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parallel steps. The EW/KPZ-type relaxation at a coarse-grained level originates 
from the "microscopic" (node-to-node) synchronizational rules. In the absence of 
the random links with purely short-range connections, the corresponding steady- 
state landscape is rough [73] (de-synchronized state), i.e., it is dominated by large- 
amplitude long-wavelength fluctuations. The extreme values of the local fluctuations 
emerge through these long-wavelength modes and, in one dimension, the extreme 
and average fluctuations follow the same power-law divergence with the system size 

fm rrmi \mi wm 1129; 



(A max ) ~ w ~ N c 



(4.13) 



where a is the roughness exponent [73 3 [Fig. 14.1( a)]. 
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Figure 4.1: (a) Scaling behavior of the average (w) and the extreme 
(above the mean A max , below the mean (f — r m j n ), maximum 
spread {r max — r TO j„)) fluctuations in the virtual time horizon 
for the conservative PDES scheme in the steady state. The 
processors are connected in a ring-like fashion. Note the log- 
log scales. The dashed line represents the theoretical power 
law with the roughness exponent a=l/2. (b) The same quan- 
tities as in (a), but the processors are connected by a small- 
world topology and the additional synchronization through 
the random link is performed with probability p=0.10 at ev- 
ery parallel step (log-normal scales). The solid straight line 
indicates the weak logarithmic increase of the extreme fluc- 
tuations with the system size. 

The extreme-value limit theorems sketched in the previous section are valid 

only for independent (or short-range correlated) random variables. Since the heights 

are strongly correlated in the ID BCS scheme, the known extreme-value limit theo- 
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rems cannot be used. A recent work on this issue sheds some light on the distribution 
of the extreme heights in the ID BCS \l'24\ I125j . Equation (j4.13j) suggests that the 
normalized probability density function of the maximum relative height A max has a 
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Figure 4.2: The scaled distributions of maximum relative height of the 
ID BCS for different system-sizes. The dashed line is the 
Airy distribution function [124J. The inset is the log-normal 
version of the same data to show the agreement with the 
theoretical curve in the tail . 

universal scaling form, P(A max , N)^N~ a f(A max /N a ). For the ID EW/KPZ with 
periodic boundary conditions (a=l/2), by using path integral techniques |124j f(x) 
was found to be the so-called Airy distribution function. Our simulation results 
show that the appropriately scaled maximum relative height distributions are in 
agreement with the theoretical distribution from |124j [Fig. 14 .2j . 



4.3 Extreme Fluctuations in Small- World-Connected Net- 
work 

The important feature of the EW model on SW networks is the develop- 
ment of an effective nonzero mass 7(p), corresponding to an actual or pseudo gap 
in a field theory sense fI7\ |3UJ H3Uj . generated by the quenched-random structure 
[2*7] . In turn, both the average correlation length £ ~ [7(p)] _1//2 and the width 
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w ~ (l/v^2)[7(p)] _1,/4 approach a finite value (synchronized state) and become self- 
averaging in the N—kx> limit [OH]- Thus, the average correlation length becomes 
finite for an arbitrarily small but nonzero strength of the random links (one such 
link per site). This is the fundamental effect of extending the original dynamics to a 
SW network: it decouples the fluctuations of the originally correlated system. Then, 
the extreme-value limit theorems can be applied using the number of independent 
blocks N/ £ in the system |122t I128j . Further, if the tail of the noise distribution 
decays in an exponential-like fashion, the individual relative height distribution will 
also do so 3 , and depends on the combination Ai/w, where A, = r— f is the relative 
height measured from the mean at site i. Considering, e.g., the fluctuations above 
the mean for the individual sites, we will then have P > (A i ) ~ exp[— c(Aj/w) 5 ], 
where P>(Aj) denotes the "disorder-averaged" (averaged over network realizations) 
single-site relative height distribution, which becomes independent of the site i for 
SW networks. From the above it follows that the cumulative distribution for the 
extreme-height fluctuations relative to the mean A max =r max — f , if scaled appropri- 
ately, will be given by Eq. (j4.r>j) [or alternatively by Eq. (|4.5jl ] in the asymptotic 
large- N limit (such that N/£^%>1). Further, from Eq. ()4.4j) . the average maximum 
relative height will scale as 



(A max ) ~ w 



MAYOr^ rw , ni i/ 5 



.1/6 



where we kept only the leading order term in N. Note, that both w and £ approach 
their finite asymptotic iV-independent values for SW-coupled systems. Also, the 
same logarithmic scaling with N holds for the largest relative deviations below the 
mean (f— r m j n ) and for the maximum spread (r max — r m i n ). This weak logarithmic 
divergence, which one can regard as marginal, ensures synchronization for practi- 
cal purposes in SW coupled multi-component systems with local relaxation in an 
environment with exponential-like noise. 

To study the extreme fluctuations of the SW-synchronized virtual time hori- 



3 The exponent S for the tail of the local relative height distribution may differ from that of the 
noise as a result of the collective (possibly non-linear) dynamics, but the exponential-like feature 
does not change. 
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zon, we "simulated the simulations", i.e., the evolution of the local simulated times 
based on the above exact algorithmic rules By constructing histograms for Aj, 

we observed that the tail of the disorder-averaged individual relative-height distribu- 
tion decays exponentially (5=1) [Fig. 14. 3j . Then, we constructed histograms for the 
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Figure 4.3: Disorder-averaged probability densities for the local height 
fluctuations for the SW-synchronized (j9=0.10) landscape for 
three system sizes indicated in the figure. Note the log- 
normal scales. The solid straight line indicates the expo- 
nential tail. 

extreme-height fluctuations Fig. 14.4( a). The scaled histograms, together with the 
similarly scaled FTG density Eq. (|4.6|) . are shown in Fig. 14.4( b). We also observed 
that the distribution of the extreme values becomes self-averaging, i.e., independent 
of the network realization. Figure 14.1( b) shows that for sufficiently large N (such 
that w essentially becomes system-size independent) the average (or typical) size of 
the extreme-height fluctuations diverge logarithmically, according to Eq. ()4.14|) with 
5=1. We also found that the largest relative deviations below the mean (f— r min ), 
and the maximum spread (r max — r min ) follow the same scaling with the system size 
N [Fig. l4.lf b)]. Note, that for our specific system (PDES time horizon), the "micro- 
scopic" dynamics is inherently nonlinear, but the effects of the nonlinearities only 
give rise to a renormalized mass j(p) (leaving 7(p)>0 for all p>0) [H]. Thus, the 
dynamics is effectively governed by relaxation in a small world, yielding a finite 
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Figure 4.4: (a) Disorder-averaged probability distributions for the ex- 
treme height fluctuations for the SW-synchronized conserva- 
tive PDES time horizons with p=0.10 for three system sizes 
indicated in the figure. Note the log-normal scales, (b) The 
same as (a), but the probability densities are scaled to zero 
mean and unit variance. The solid curve corresponds to the 
similarly scaled FTG density Eq. (J4.6|) for comparison. 

correlation length and, consequently, the slow logarithmic increase of the extreme 

fluctuations with the system size [Eq. (|4.14j) ]. Also, for the PDES time horizon, the 

local height distribution is asymmetric with respect to the mean, but the average 

size of the height fluctuations is, of course, finite for both above and below the 

mean. This specific characteristic simply yields different prefactors for the extreme 

fluctuations [Eq. 1)4.14)1 ] above and below the mean, leaving the logarithmic scaling 

with N unchanged. 

4.4 Synchronization in the Presence of Power-Law Noise 

Employing SW-like synchronization networks to suppress large fluctuations 
was successful in the presence exponential-like "noise". We now investigate the 
scenario when the noise distribution exhibits a power-law tail. We consider the syn- 
chronization problem from parallel discrete-event simulations for power-law tailed 
asynchrony. The condition for updating the local "height" variables in the synchro- 
nization landscape (corresponding to the local virtual times) is unchanged, i.e., a 
node is only allowed to increment its local simulated time 7$ if it is a local min- 
imum in the virtual neighborhood (possibly including the random neighbor with 
probability p). The increment, however, is now drawn from a power-law probability 
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density p{rj) ~ l/rf +l . Since the above local update rule is, essentially, relaxation 
on the network, this model also serves as a prototypical model for relaxation on 
SW networks in an environment with power-law noise. The above synchronization 
rules can be applied to simulating systems with non-Poisson asynchrony, relevant 
to various transport and transmission phenomena in natural and artificial systems 
[13H I132(. I133j . For example, in Internet or WWW traffic, in part, as a result of uni- 
versal power-law tail file-size distributions |134[ I135j , service times exhibit similarly 
tailed distributions in the corresponding queuing networks [136. 11371 1138] . In turn, 
when simulating these systems, the corresponding PDES should use power-law tail 
distributed local simulated time increments. 

For a purely one-dimensional connection topology (in the absence of the ran- 
dom links) we observed kinetic roughening. Since the time to reach the steady, the 
relaxation time in the steady state, and the surface width all diverge with the num- 
ber of nodes, it is difficult to measure the roughness exponent accurately. It is well 
documented [?3] , however, that KPZ-like growth in the presence of power-law noise 
leads to anomalous roughening (yielding a roughness exponent greater than 1/2 in 
ID). 
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Figure 4.5: Snapshot for the SW-synchronized (p=0.10) landscape in a 
power-law noise environment (7=3) for iV=10 4 nodes. 

Here we show and discuss results for the power-law noise generated growth on 
SW networks. We have chosen two values of 7 governing the tail of the probability 
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Figure 4.6: Disorder-averaged probability distributions in a power-law 
noise environment (7=3) with iV=10 4 nodes for the local 
height fluctuations for three system sizes indicated in the 
figure. Note the log-normal scales. The inset shows the same 
(for the positive domain) on log-log scales. The solid line 
corresponds to the slope of \i + 1 ~ 3. 

density for the noise: 7=3 and 7=5. For both of these cases the noise have a finite 

mean and variance. One can expect a power-law tail (at least for above the mean) for 

the probability density of the individual local height fluctuations p(A;) ~ 1/Af +1 , 

once the noise is "filtered through" the collective dynamics. In Fig. 14.51 we show a 

snapshot for the resulting synchronization landscape, indicating the presence of some 

rare but very large fluctuations above the mean. Since the local update rules lead 

to nonlinear (KPZ-like) effective interactions, we could not predict the exponent 

of the local height distribution. Instead, we constructed histograms representing 

£>(Aj). For the above two values of the noise exponent, 7=3 and 7=5, we observed 

power-law tail exponents for p(Ai) ~ 1/Af +1 as well, but with exponents clearly 

differing from that of the noise, /i~2 and /^~4, respectively. Figure ll~o1 shows p(Ai) 

for the former. The figure indicates that for large A; a power-law tail develops, while 

fluctuations below the mean exhibit an exponential-like tail. This asymmetry is due 

to the asymmetry in the microscopic update rules: local minima were incremented 

by power-law distributed random amount, hence anomalously large deviations above 

the mean can emerge. 

Then we analyzed the scaling behavior of the average and the extreme height 



59 



fluctuations in the associated synchronization landscape. In the limit of large N, w 
becomes system-size independent, while the extreme- height fluctuations above the 
mean diverge in a power-law fashion according to Eq. ()4.10j) [Fig. 14.7] . Fitting a 
power law for iV>10 3 yields (A max ) ~ TV - 47 and (A max ) ~ jV°- 25 for the two cases 
analyzed in Fig. 14.7] for 7=3 and 7=5, respectively. In order to understand the 
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Figure 4.7: Steady-state scaling behavior of the average (w) and the ex- 
treme (A max ) fluctuations in the synchronization landscape 
for power-law noise with exponent 7=3 (filled symbols) and 
7=5 (open symbols) using SW links with p=0.10. The straight 
solid-line segments are the best-fit power laws ((A max ) ~ N 1 ^ 
with 1/^=0.47 and = 0.25, respectively) for the extreme 
fluctuations for system sizes N > 10 3 , according to Eq. (|4.10]) . 

underlying reason for this divergence, we analyzed the histograms constructed for 

the probability density of the extreme height fluctuations p(A max ) [Fig. 14.8] . The 

shapes of these histograms suggest that the limit distribution is of Frechet type. We 

constructed the histograms for the scaled variable y = A max / ( A max ) . Then using 

[A = 1/0.47 = 2.1 and \i = 1/0.25 = 4 as implied by the scaling behavior of (A max ) 

[Eq. (l4~TUj) ]. we plotted the similarly scaled Frechet density Eq. (I4~TT1) [Fig. OTb)]. 

These results indicate that the effect of the random links in SW networks is again 

to decouple the local field variables, an in turn, the statistics of the extremes are 

governed by the Frechet distribution. Consequently, the average size of the extremes 
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diverges in a power-law fashion (A max ) ~ N 1 ^. This power-law divergence is not 
the result of a divergent length scale emerging from the cooperative effects of the 
interacting nodes. On the contrary, the local field variables become effectively in- 
dependent using SW synchronization. The tail behavior for them (power-law with 
a possibly different exponent), however, is inherited from the noise. Hence, the 
statistics of the extremes will be of the Frechet type, yielding a power-law increase 
of the average size of the largest fluctuations above the mean. 



(a) (b) 




Figure 4.8: (a) Disorder-averaged probability distributions for the ex- 
treme height fluctuations for the SW-synchronized (p=0.10) 
landscape in a power-law noise environment for 7=3 (filled 
symbols) and 7=5 (open symbols) for three system sizes indi- 
cated in the figure. Note the log-log scales, (b) Scaled form 
of probability densities in Fig. 14. 8L The solid curves corre- 
spond to the similarly scaled Frechet density Eq. (J4.11|) for 
comparison. 

The above picture is reasonably consistent in that the exponents for the tail 
behavior (~1/Af +1 ) for both the probability density of the local heights p(\) and 
the extremes p(A max ) were within about 6%. Further, the average size of the ex- 
tremes increases as N 1 ^, in accordance with the underlying Frechet distribution. 

It is interesting to note that for /i~2, formally, p(Aj) does not have a finite 
variance (associated with the width w = y 1 {{ji — f ) 2 ) = J ((A3) 2 )). Indeed, in 
the simulations we observed large fluctuations in w and error bars of the order of 
the width itself. The "theoretical" divergence for /i=2 is, of course, limited by the 
logarithm of a large but finite cutoff in the simulations. This anomalous (formally 
divergent) width is not related to a system-size dependent widening of the individual 
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distributions controlled by a divergent correlation length. Rather, the individual 
distributions develop a heavy-tailed shape independent of the system size. 

Examining the largest fluctuations below the mean reveals that they increase 
only logarithmically with the system size. This is simply the result of the exponential 
or similar tail of the individual local height fluctuations below the mean [Fig. 14. 5j . 
where the governing limit distribution is of the Gumbel type (Sec. 14.1. ip . 

The above results show, that SW synchronization can be efficient to control the 
average size of the fluctuations, but the largest fluctuations still diverge in a power- 
law fashion with the number of nodes. While the SW-network effectively decouples 
the fluctuations in the synchronization landscape, it cannot suppress power-law tails 
already present in local noise distribution. In fact, the inherited power-law tails for 
the local height fluctuations are even "heavier" than that of the corresponding noise, 
/i«2 for 7=3 and /i~4 for 7=5. 



CHAPTER 5 
SUMMARY AND FUTURE WORK 



5.1 Summary 

We studied synchronization phenomena in networks in general, and consid- 
ered the scalability problem of the basic conservative synchronization schemes to 
test our results. Based on a mapping between the evolution of the virtual 
time horizon for the basic conservative PDES scheme [531 EH] and kinetically grown 
non-equilibrium surfaces [73], we constructed a coarse-grained description for the 
scalability and performance of such large-scale parallel simulation schemes. These 
schemes can be applied to large spatially extended systems with short-range inter- 
actions and asynchronous dynamics. The one-site-per PE basic PDES was shown 
to exhibit KPZ-like kinetic roughening. This scheme is scalable in that the average 
progress rate of the PEs approaches a non-zero value. The spread of the virtual 
time horizon, however, diverges as the square root of the number of PEs, leading to 
"de-synchronization" and difficulties in data management. 

In this work we considered the simplest (and in some regards, the worst case) 
scenario, where each nodes carries one site of the underlying physical system, hence 
synchronization with nearest neighbor PEs is required at every step. In actual 
parallel implementations the efficiency can be greatly increased by hosting many 
sites by each PE |47| l50| l5T] . That way, communication between PEs is only required 
when local variables are to be updated on the boundary region of the sites hosted 
by the PEs (within the finite range of the interactions). While the above procedure 
clearly increases the utilization and reduces the actual communication overhead, 
it gives rise to an even faster growing early time regime in the simulated time 
horizon Since the PEs rarely need to synchronize, up to some crossover 

time, the evolution of the time horizon is governed by random deposition [73], a 
faster roughening growth, before eventually crossing over to the KPZ growth and a 
subsequent saturation. 

Our goal here was to achieve synchronization without any global intervention. 
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We constructed a specific version of the SW network, where each PE was connected 
to exactly one other randomly chosen PE. The extra synchronizational steps through 
the random links are merely used to control the width. The virtual time horizon for 
the SW-synchronized PDES scheme becomes "macroscopically" smooth and essen- 
tially exhibits mean-field like characteristics. The random links, on top of a regular 
lattice, generate an effective "mass" for the propagator of the virtual time horizon, 
corresponding to a nonzero correlation length. The width becomes finite, for an ar- 
bitrary small rate of synchronization through the random links, while the utilization 
remains nonzero, yielding a fully scalable PDES scheme. The former statement is 
only marginally weakened by observing that the extreme fluctuations in the time 
horizon can exhibit logarithmically large values as a function of the total number 
of PEs. The above predictions of the coarse-grained PDES model were confirmed 
by actually "simulating the simulations". The generalization when random links 
are added to a higher- dimensional underlying regular lattice is clear: since the syn- 
chronization landscape of the ID SW network is already macroscopically smooth, 
in higher dimensions it will be even more so [22] (i.e., the critical dimension of the 
underlying regular substrate is less than one). 

We also studied the scalability properties for a causally constrained PDES 
scheme hosted by a network of computers where the network is scale-free following 
a "preferential attachment" construction |5| ll()2j . Here the PEs have to satisfy the 
general criterion that their simulated time should be smaller than that of all of their 
links' simulated times in order to advance their local time. Despite some nodes 
in the network having abnormally large connectivity (as a result of the scale-free 
nature of the degree distribution), we found that the computational phase of the 
algorithm is only marginally non-scalable. The utilization exhibited slow logarith- 
mic decay as a function of the number of PEs. At the same time, the width of the 
time horizon diverged logarithmically slowly, rendering the measurement phase of 
the simulations marginally non-scalable as well. The implication of this finding is 
that the internet, which is already exploited for distributed computing for mostly 
"embarrassingly parallel" problems through existing GRID-based schemes |T7| I16j. 
may have the potential to accommodate efficient complex system simulations (such 
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as asynchronous PDES) where the nodes frequently have to synchronize with each 
other. An intriguing question to pursue is how the logarithmic divergence of the 
surface fluctuations observed here can be related to the collective behavior (in par- 
ticular, the finite-size effects of the magnetic susceptibility) of Ising ferromagnets on 
scale-free networks |14()[ I141[ I142[ 1143] with the same degree distribution. 

We also considered the extreme-height fluctuations in this prototypical model 
with local relaxation, unbounded local variables, and in the presence of exponen- 
tial or power-law tailed noise. We showed that when the interaction topology is 
extended to include random links in a SW fashion, the local height variables be- 
come effectively independent and the statistics of the extremes is governed by the 
FTG or the Frechet distribution, respectively. For both types of noise, the average 
width of the synchronization landscape becomes independent of the system size. 
The extreme fluctuations increase only logarithmically with the number of nodes 
for exponential-like noise and in a power-law fashion for the power-law noise. These 
findings directly addresses synchronizability in generic SW-coupled systems where 
relaxation through the links is the relevant node-to-node process and effectively 
governs the dynamics. We illustrated our results on an actual synchronizational 
problem in the context of scalable parallel simulations. 

Our findings are also closely related to critical phenomena and collective phe- 
nomena on networks [21 El ESI I144j . In particular, in recent years, a number of 
prototypical models have been investigated on SW networks [22 1221 EH EH 1^1 

123 123 123 EDI EH E21 E3 El EH1 IM1 ESI mg IHZl IHHl UMl mni HMJ. Of these, 

the ones most closely related to our work are the XY model [2E1> the EW model 
[23 EB1 EE] , and diffusion [23EBlEniEniEIlE21EIIlEI!onSW networks. The find- 
ings suggest that systems without inherent frustration exhibit (strict or anomalous) 
[23 EHl EH EH 115 lj mean-field-like behavior when the original short-range interac- 
tion topology is modified to a SW network. In essence, the SW couplings, although 
sparse, induce an effective relaxation to the mean of the respective local field vari- 
ables, and in turn, the system exhibits a mean-field-like behavior |151j . This effect 
is qualitatively similar to those observed in models with "annealed" long-range ran- 
dom couplings [321 EEH21 UHH] , but on (quenched) SW networks, the scaling properties 
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can differ from those with annealed interactions [27| 1251 I151j . 

5.2 Future Work 

In many real networks the cost of the random and arbitrarily long-range links 
could be unaffordable. Recent works show that in cortical j!54| and on-chip logic 
networks [155J power-law-suppressed link-length distributions are observed because 
these are spatially embedded networks with wiring-costs |156j . To study these net- 
works one can consider a special model of SW network with distance-dependent 
power-law probability distribution of long-range links, where the probability of two 
nodes being connected is r~ a , with a varying exponent a and the distance r between 
the nodes (23123- By varying the exponent a, one can control the distribution of the 
random links. Changing a changes the topology of the network from the "plain" SW 
in which there is no wiring cost as we discussed in Chapter 3 (a=0), to a short-range 
network in which only nearest neighbor links are present (a=oo). 

The synchronization problem described in the previous chapters can also be 
studied on this network. The roughness of the time-surface as a function of the 
exponent a can give valuable information about the conditions on the scalability of 
the PDES. One can follow the same way of constructing the network; one random 
link per PE (chosen with distance-dependent probability) in addition to the nearest 
neighbors. The width of the linear EW model obtained through adjacency matrix 
diagonalization exhibits a smooth transition from system-size independent width 
(plain SW) to the KPZ width dependence ((u> 2 )~iV) [Fig. 15. If a)]. Simulating the 
"microscopic" dynamics in the actual PDES (incrementing the "local" minima) 
can give rise to nonlinear effects, typical for the KPZ surface, as can be seen in 
Fig. 15. li b). We see the transition from plain SW to KPZ again but through critical 
instability. By analyzing the Fig. 15.1( b) we can argue that for some critical range of 
a, the scalability of the PDES is worse than the BCS scheme (short-range network). 
Another surface growth model, single-step model (KPZ class), has also the same 
effect suggesting that as long as there is nonlinearity in the growth, the similar 
peaks in the width will be seen. 

Another possible future work might be to study the scale-free Barabasi- Albert 
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Figure 5.1: (a) Width as a function of a for the EW model on a l/r a SW 
network, (b) Width as a function of a for the conservative 
synchronization dynamics on a l/r° SW network. 

network further. In the preferential attachment process to construct the BA network 

at each time step we added only one node to the network (m=l). This led to 

logarithmic (marginal) scalability of the PDES scheme. It might help to increase 

m in suppressing the virtual time fluctuations and thus making the PDES scheme 

fully scalable in this network as well. 
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APPENDIX A 
Steady- State Structure Factor in Linear Growth Models 



The time evolution of linearly interacting local field variables on a network can be 
written in the form of a Langevin equation as 

d t r i (t) = - 1 £r ij T j (t)+ri i (t) , (A.l) 

3 

where is the coupling matrix containing the topology of the network, and f]i(t) 

is the noise delta-correlated in space/time with zero average as expressed in the 
following equations 

(77,(0) = (A.2) 

and 

( Vi (t)r h (t'))=2D5 i , j 5(t-t'). (A.3) 
For the nearest-neighbor network r„ is the discrete Laplacian 

T ij = = 2 kj - k-ij ~ ■ ( A - 4 ) 

For the maximal-distance network 

y,i I''' • -(»,.y-», s/2j). (A.5) 
For the fully-connected network 

r« = r° + i(s itj - 1) . (A.6) 

We consider cases where is translationally invariant, i.e., it does not depend 
on i and j explicitly but depends only on the distance l=i — j between them, 

r y = r(< - j) = r(0 . (A.?) 
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The structure factor [as defined through Eq. (|2.9)) ] contains all the physics we 
need to describe the evolution of the network. One needs the Fourier transforms of 
the local field variables defined as 

N 

f k (t) = Y.e-^[r J {t)-m. (A.8) 
Taking the Fourier transformation of Eq. ()A.1|) one finds 

dthit) = -f{k)r k {t) + fj k {t) , (A.9) 

where T{k) and fj k (t) are the Fourier transforms of T(l) and r]i(t), respectively. The 
wave number k goes from 1 to A— 1 since we exclude the zero- mode contribution, 
fo(t)=0 for all t. We can see that the evolution decouples for different k values in 
Fourier space. The second moment of the Fourier transform of the noise is 

(Vk(t)Vk'(t')) = 2DN5 k+k ,, 6(t - t') . (A.10) 

Integrating Eq. (jA.9|) we obtain 

fdt) = e- f ^ f dt'e^WfjkCt') . (A.ll) 
Jo 

By using Eq. ()A.11|) one can write the equal-time correlations for the local 
field variables as 

(Ut)fy{t)) = e-™+nk')]tj dt , J ^(fcJf+W^^^^/)) . (A . 12) 

By substituting the second moment of the Fourier transform of the noise [Eq. ()A.10|) 
into Eq. (|A.12j) . and by using the basic property of the delta function 5(t' — t") in 
the integral, one obtains 

(h(t)f k ,(t)) = 2 J DAW,oe" [f f dt'e^ +f ^' . (A.13) 

J o 
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The integral in the equation above can be evaluated easily and one obtains 

jf(h)+f(k')]t _ I 

(h(t)f k ,(t)) = 2DN5 k+k , e-™+ ^ l 

r(fc) + r(fc') 

After rearranging the terms and rewriting the delta function as 5k+k',o=$k',-k, one 
obtains 

= fwTf('-fc) {1 " e ' [m+t{ ~ k) ^ ■ (A.15) 
From the present form of Eq. (jA.15j) and by using Eq. (J2.9|) one can deduce the 
general time-dependent structure factor as 

V ; T(k) + T(-ky 1 V ; 

In the steady-state (£— >oo) the structure factor becomes 

2D 

S(k) = ]xmS(k,t) = • (A.17) 

v ; v ; r(ife) + r(-ife) 

The steady-state structure factor can be calculated easily once the Fourier 
transform of the coupling function T(k) is known. Now we calculate the structure 
factors for a few simple interaction topologies. 



A.l Nearest-neighbor network 

The coupling matrix for the nearest-neighbor network is a Laplacian, 

Tij = T°. = 26ij - Si-ij - S i+ ij , (A. 18) 

and one can rewrite the equation above by using the distance-dependent coupling 
function 

T°(t) = 26 lt0 -5i,i-6 l ,- 1 , (A.19) 
then the Fourier transform of T°(l) becomes 

f°(Jfe) = (2 - e ik - e~ ik ) = 2[1 - cos(ife)] . (A.20) 
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So the structure factor is, as in Eq. (|2.1(J|) . 

S(k) = - ° — -. . (A.21) 

v ; 2[1 - cos(Jfe)] v ; 

A. 2 Maximal-distance network 

Starting with the coupling matrix for the maximal- distance network, one ob- 
tains the distance- dependent coupling function as 

r(i) = r°(0 + 7(^0 -s l4 ). (A.22) 

Then the Fourier transform T(k) becomes, 

r(ife) = r°(fc) + 7(1 - e~ ) = 2[1 - cos(Jfe)] + 7(1 - e~ ). . (A.23) 
By using Eq. (|A.17|) we obtain 



D 

2[1 - cos(ife)] +7[1 - cos(^f-)] 



S ( k ) = om — ■ - m iTTTfciVvi • ( A - 24 ) 



A. 3 Fully-connected network 

In this network, as we mentioned in Chapter 3, each node is connected to all 
other nodes with strength 7/iV and the coupling function is 

r(0=r°(0+7(«5i,o-i). (A.25) 

Its Fourier transform becomes 

f(ife) = f°(fc) + 7 = 2[1 - cos(ife)] + 7 . (A.26) 



Thus one can obtain the structure factor as 



